CLONING OF CYTOCHROME P450 GENES FROM NICOTIANA

The present invention relates to nucleic acid
sequences encoding cytochrome p450 enzymes (hereinafter
referred to as p450 and p450 enzymes) in Nicotiana plants
and methods for using those nucleic acid sequences to
alter plant phenotypes.

BACKGROUND 

Cytochrome p450s catalyze enzymatic reactions for
a diverse range of chemically dissimilar substrates,
including the oxidative, peroxidative and reductive
metabolism of endogenous and xenobiotic substrates. In
plants, p450s participate in biochemical pathways that
include the synthesis of plant products such as
phenylpropanoids, alkaloids, terpenoids, lipids,
cyanogenic glycosides, and glucosinolates (Chapple, Annu.
Rev. Plant Physiol. Plant Mol. Biol. 1998, 49:311-343).
Cytochrome p450s, also known as p450 heme-thiolate
proteins, usually act as terminal oxidases in
multi-component electron transfer chains, called
p450-containing monooxygenase systems. Specific reactions
catalyzed include demethylation, hydroxylation,
epoxidation, N-oxidation, sulfoxidation, N-, S-, and
O-dealkylations, desulfation, deamination, and reduction
of azo, nitro, and N-oxide groups.

Nicotiana plant p450 enzymes have been implicated in
affecting a variety of plant metabolites such as
phenylpropanoids, alkaloids, terpenoids, lipids,
cyanogenic glycosides, glucosinolates and a host of other
chemical entities. In recent years, it has become
apparent that some p450 enzymes can impact the
composition of plant metabolites. For example, it has
long been desired to improve the flavor and aroma of
certain plants by altering their profile of selected
fatty acids through breeding; however, very little is
known about the mechanisms involved in controlling the
levels of these leaf constituents. The down regulation
of p450 enzymes associated with the modification of fatty
acids may facilitate accumulation of desired fatty acids
that provide more preferred leaf phenotypic qualities.
The function of p450 enzymes and their broadening roles
in plant constituents are still being discovered. For
instance, a special class of p450 enzymes was found to
catalyze the breakdown of fatty acids into volatile C6-
and C9-aldehydes and -alcohols that are major
contributors to the "fresh green" odor of fruits and
vegetables. The level of other novel targeted p450s may
be altered to enhance the qualities of leaf constituents
by modifying lipid composition and related breakdown
metabolites in Nicotiana leaf. Several of these
constituents in leaf are affected by senescence, which
stimulates the maturation of leaf quality properties.
Still other reports have shown that p450 enzymes play a
functional role in altering fatty acids that are
involved in plant-pathogen interactions and disease
resistance.






WO 2004/035745 



PCT/US2003/032722 



In other instances, p450 enzymes have been suggested
to be involved in alkaloid biosynthesis. Nornicotine is
a minor alkaloid found in Nicotiana tabacum. It has
been postulated that it is produced by the p450-mediated
demethylation of nicotine, followed by acylation and
nitrosation at the N position, thereby producing a series
of N-acylnornicotines and N-nitrosonornicotines.
N-demethylation, catalyzed by a putative p450
demethylase, is thought to be a primary source of
nornicotine biosynthesis in Nicotiana. While the enzyme
is believed to be microsomal, thus far a nicotine
demethylase enzyme has not been successfully purified,
nor have the genes involved been isolated.

Furthermore, it is hypothesized but not proven that
the activity of p450 enzymes is genetically controlled
and also strongly influenced by environmental factors.
For example, the demethylation of nicotine in Nicotiana
is thought to increase substantially when the plants
reach a mature stage. It is also hypothesized, yet not
proven, that the demethylase gene contains a transposable
element that can inhibit translation of RNA when present.

The large multiplicity of p450 enzyme forms and their
differing structures and functions made research on
Nicotiana p450 enzymes very difficult before the present
invention. In addition, cloning of p450 enzymes
has been hampered at least in part because these
membrane-localized proteins are typically present in low
abundance and are often unstable to purification. Hence,
a need exists for the identification of p450 enzymes in
plants and the nucleic acid sequences associated with
those p450 enzymes. In particular, only a few cytochrome
p450 proteins have been reported in Nicotiana. The
inventions described herein entail the discovery of a
substantial number of cytochrome p450 fragments that
correspond to several groups of p450 species based on
their sequence identity.

SUMMARY 

The present invention is directed to plant p450 
enzymes. The present invention is further directed to 
plant p450 enzymes from Nicotiana. The present invention 
is also directed to p450 enzymes in plants whose 
expression is induced by ethylene and/or plant 
senescence. The present invention is yet further
directed to nucleic acid sequences in plants having
enzymatic activities, for example, being categorized as
oxygenase, demethylase and the like, or other, and to the
use of those sequences to reduce or silence the expression
or over-expression of these enzymes. The invention also
relates to p450 enzymes found in plants containing higher 
nornicotine levels than plants exhibiting lower 
nornicotine levels. 

In one aspect, the invention is directed to nucleic
acid sequences as set forth in SEQ. ID. Nos. 1, 3, 5, 7,
9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35,
37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63,
65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91,
95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117,
119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139,
143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163,
165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185,
187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207,
209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229,
231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251,
253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273,
275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295 and
297.

In a second related aspect, those fragments
containing greater than 75% identity in nucleic acid
sequence were placed into groups dependent upon their
identity in a region corresponding to the first nucleic
acid following the cytochrome p450 motif GXRXCX(G/A) to
the stop codon. The representative nucleic acid groups
and respective species are shown in Table I.
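
By way of illustration only, the region used for this grouping could be located computationally. The following Python sketch is not part of the disclosed procedure. It assumes the fragment is supplied as an in-frame DNA coding sequence, uses the standard genetic code, treats X in the motif as any amino acid, and includes the stop codon in the returned region. The names MOTIF, translate and region_after_motif are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# GXRXCX(G/A): X is any residue, last position is G or A.
MOTIF = re.compile(r"G.R.C.[GA]")


def translate(nt):
    """Translate an in-frame DNA string; unknown codons become 'X'."""
    nt = nt.upper()
    return "".join(
        CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3)
    )


def region_after_motif(nt):
    """Return the nucleotides from the first base after the motif
    through the first downstream stop codon (assumption: inclusive).
    Returns None when the motif is absent."""
    protein = translate(nt)
    match = MOTIF.search(protein)
    if match is None:
        return None
    start = match.end() * 3
    stop = protein.find("*", match.end())
    # Without a stop codon, fall back to the last complete codon.
    end = (stop + 1) * 3 if stop != -1 else len(nt) - len(nt) % 3
    return nt[start:end].upper()


if __name__ == "__main__":
    fragment = "GGTGCTCGTGCTTGTGCTGGTAAATTTTAAGGG"  # made-up test input
    print(region_after_motif(fragment))
```

The extracted regions, rather than the whole fragments, would then be compared pairwise against the 75% threshold.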

In a third aspect, the invention is directed to
amino acid sequences as set forth in SEQ. ID. Nos. 2, 4,
6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34,
36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62,
64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90,
92, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116,
118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138,
140, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162,
164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184,
186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206,
208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228,
230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250,
252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272,
274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294,
296 and 298.

In a fourth related aspect, those fragments
containing greater than 71% identity in amino acid
sequence were placed into groups dependent upon their
identity to each other in a region corresponding to the
first amino acid following the cytochrome p450 motif
GXRXCX(G/A) to the stop codon. The representative amino
acid groups and respective species are shown in Table II.
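
The grouping by identity threshold described above may be sketched as follows. This is an illustrative Python example under stated assumptions, not the procedure actually used to prepare the Tables: the regions are assumed to be already aligned to equal length (with "-" marking gaps), identity is counted over ungapped columns only, and groups are formed by single linkage, so that two fragments fall in the same group whenever a chain of pairwise identities above the threshold connects them. The function names and the example sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity of two pre-aligned sequences of equal length.
    Columns containing a gap ('-') in either sequence are ignored."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    return 100.0 * sum(x == y for x, y in columns) / len(columns)


def group_by_identity(seqs, threshold):
    """Single-linkage grouping: link any pair whose identity is
    greater than `threshold`, then return the connected groups."""
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if percent_identity(seqs[a], seqs[b]) > threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    # Made-up aligned regions, for demonstration only.
    regions = {
        "f1": "MKTLLVAGSA",
        "f2": "MKTLLVAGSP",
        "f3": "WWWWWWWWWW",
    }
    print(group_by_identity(regions, threshold=71.0))
```

The same routine applies to the nucleic acid grouping by passing aligned nucleotide regions and a threshold of 75.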

In a fifth aspect, the invention is directed to 
amino acid sequences of full length genes as set forth in 
SEQ. ID. Nos. 150, 152, 154, 156, 158, 160, 162, 164, 
166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 
188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 
210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 
232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 
254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 
276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296 and 
298. 

In a sixth related aspect, those full length genes
containing 85% or greater identity in amino acid sequence
were placed into groups dependent upon their identity to
each other. The representative amino acid groups and
respective species are shown in Table III.

In a seventh aspect, the invention is directed to 
amino acid sequences of the fragments set forth in SEQ. 
ID. Nos. 299-357. 

In an eighth related aspect, those fragments
containing 90% or greater identity in amino acid sequence
were placed into groups dependent upon their identity to
each other in a region corresponding to the first
cytochrome p450 domain, UXXRXXZ, to the third cytochrome
domain, GXRXC, where U is E or K, X is any amino acid and
Z is R, T, S or M. The representative amino acid groups
and respective species are shown in Table IV.

In a ninth related aspect, the reduction or 
elimination or over-expression of p450 enzymes in 
Nicotiana plants may be accomplished transiently using 
RNA viral systems. 

Resulting transformed or infected plants are 
assessed for phenotypic changes including, but not 
limited to, analysis of endogenous p450 RNA transcripts, 
p450 expressed peptides, and concentrations of plant 
metabolites using techniques commonly available to one 
having ordinary skill in the art. 

In a tenth important aspect, the present invention
is also directed to generation of transgenic Nicotiana
lines that have altered p450 enzyme activity levels. In
accordance with the invention, these transgenic lines
include nucleic acid sequences that are effective for
reducing or silencing or increasing the expression of
certain enzymes, thus resulting in phenotypic effects
within Nicotiana. Such nucleic acid sequences include
SEQ. ID. Nos. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 
25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 
53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 
81, 83, 85, 87, 89, 91, 95, 97, 99, 101, 103, 105, 107, 
109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 
131, 133, 135, 137, 139, 143, 145, 147, 149, 151, 153, 
155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 
177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 
199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 
221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 
243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 
265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 
287, 289, 291, 293, 295 and 297. 

In a very important eleventh aspect of the 
invention, plant cultivars including nucleic acids of the 
present invention in a down regulation capacity using 
either full length genes or fragments thereof or in an 
over-expression capacity using full length genes will 
have altered metabolite profiles relative to control
plants.

In a twelfth aspect of the invention, plant
cultivars including nucleic acid of the present invention,
using either full length genes or fragments thereof in
modifying the biosynthesis or breakdown of metabolites
derived from the plant or external to the plant, will
have use in tolerating certain exogenous chemicals or
plant pests. Such nucleic acid sequences include SEQ. ID.
Nos. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27,
29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55,
57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83,
85, 87, 89, 91, 95, 97, 99, 101, 103, 105, 107, 109, 111,
113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133,
135, 137, 139, 143, 145, 147, 149, 151, 153, 155, 157,
159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179,
181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201,
203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223,
225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245,
247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267,
269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289,
291, 293, 295 and 297.

In a thirteenth aspect, the present invention is 
directed to the screening of plants, more preferably 
Nicotiana, that contain genes that have substantial 
nucleic acid identity to the taught nucleic acid 
sequence. The use of the invention would be advantageous 
to identify and select plants that contain a nucleic acid 
sequence with exact or substantial identity where such 
plants are part of a breeding program for traditional or 
transgenic varieties, a mutagenesis program, or naturally 
occurring diverse plant populations. The screening of 
plants for substantial nucleic acid identity may be
accomplished by evaluating plant nucleic acid materials
using a nucleic acid probe in conjunction with nucleic 
acid detection protocols including, but not limited to, 
nucleic acid hybridization and PCR analysis. The nucleic 
acid probe may consist of the taught nucleic acid 
sequence or fragment thereof corresponding to SEQ ID 1, 
3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 
33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 
61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 
89, 91, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 
115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 
137, 139, 143, 145, 147, 149, 151, 153, 155, 157, 159, 
161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 
183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 
205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 
227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 
249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 
271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 
293, 295 and 297. 
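
The laboratory screening described above relies on hybridization or PCR. Purely as an in silico analogy, and not as a substitute for those protocols, the following Python sketch scans sequenced plant material for the best ungapped match to a probe on either strand. The 90% cut-off, the function names and the example sequences are hypothetical and are not taken from this disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def best_ungapped_identity(probe, target):
    """Slide the probe along both strands of the target and return
    the highest percent identity found (no gaps allowed)."""
    probe, target = probe.upper(), target.upper()
    if not probe or len(probe) > len(target):
        return 0.0
    best = 0.0
    for strand in (target, reverse_complement(target)):
        for i in range(len(strand) - len(probe) + 1):
            window = strand[i:i + len(probe)]
            matches = sum(p == t for p, t in zip(probe, window))
            best = max(best, 100.0 * matches / len(probe))
    return best


def screen(probe, samples, min_identity=90.0):
    """Return the names of samples whose best match to the probe
    meets the (hypothetical) identity cut-off."""
    return [
        name for name, seq in samples.items()
        if best_ungapped_identity(probe, seq) >= min_identity
    ]


if __name__ == "__main__":
    probe = "ATGGCCTTAG"  # made-up probe, for demonstration only
    samples = {
        "plant_a": "CCCC" + probe + "GGGG",
        "plant_b": reverse_complement("TTTT" + probe + "AAAA"),
        "plant_c": "ACACACACACACACACAC",
    }
    print(screen(probe, samples))
```

An ungapped scan is chosen here only for brevity; a gapped local alignment would be the more usual choice for sequences of substantial rather than exact identity.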

In a fourteenth aspect, the present invention is 
directed to the identification of plant genes, more 
preferably Nicotiana, that share substantial amino acid 
identity corresponding to the taught nucleic acid 
sequence. The identification of plant genes, including
both cDNA and genomic clones, more preferably from
Nicotiana, may be accomplished by screening plant cDNA
libraries using a nucleic acid probe in conjunction with
nucleic acid detection protocols including, but not
limited to, nucleic acid hybridization and PCR analysis.
The nucleic acid probe may be comprised of nucleic acid
sequence or fragment thereof corresponding to SEQ ID 1,
3, 5, 7, 9,
11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 
39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 
67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 95, 
97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 
121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 143, 
145 and 147. 

In an alternative fifteenth aspect, cDNA expression
libraries that express peptides may be screened using
antibodies directed to part or all of the taught amino
acid sequence. Such amino acid sequences include SEQ ID
2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30,
32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 
60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 
88, 90, 92, 96, 98, 100, 102, 104, 106, 108, 110, 112, 
114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 
136, 138, 140, 144, 146, 148. 

In a sixteenth important aspect, the present
invention is also directed to generation of transgenic
Nicotiana lines that have over-expression of p450 enzyme
activity levels. In accordance with the invention, these
transgenic lines include all nucleic acid sequences
encoding the amino acid sequences of full length genes
that are effective for increasing the expression of
certain enzymes, thus resulting in phenotypic effects
within Nicotiana. Such amino acid sequences include SEQ.
ID. Nos. 150, 152, 154, 156, 158, 160, 162, 164, 166,
168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188,
190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210,
212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232,
234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254,
256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276,
278, 280, 282, 284, 286, 288, 290, 292, 294, 296 and 298.



BRIEF DESCRIPTION OF DRAWINGS 

Figure 1 shows nucleic acid SEQ. ID. No.: 1 and amino
acid SEQ. ID. No.: 2.

Figure 2 shows nucleic acid SEQ. ID. No.: 3 and amino
acid SEQ. ID. No.: 4.

Figure 3 shows nucleic acid SEQ. ID. No.: 5 and amino
acid SEQ. ID. No.: 6.

Figure 4 shows nucleic acid SEQ. ID. No.: 7 and amino
acid SEQ. ID. No.: 8.

Figure 5 shows nucleic acid SEQ. ID. No.: 9 and amino
acid SEQ. ID. No.: 10.

Figure 6 shows nucleic acid SEQ. ID. No.: 11 and
amino acid SEQ. ID. No.: 12.

Figure 7 shows nucleic acid SEQ. ID. No.:13 and 
amino acid SEQ. ID. No.: 14. 

Figure 8 shows nucleic acid SEQ. ID. No.:15 and 
amino acid SEQ. ID. No.: 16. 

Figure 9 shows nucleic acid SEQ. ID. No.: 17 and
amino acid SEQ. ID. No.: 18.

Figure 10 shows nucleic acid SEQ. ID. No.:19 and 
amino acid SEQ. ID. No.:20. 

Figure 11 shows nucleic acid SEQ. ID. No.: 21 and
amino acid SEQ. ID. No.: 22.

Figure 12 shows nucleic acid SEQ. ID. No.: 23 and 
amino acid SEQ. ID. No.: 24. 

Figure 13 shows nucleic acid SEQ. ID. No.:25 and 
amino acid SEQ. ID. No.: 26. 

Figure 14 shows nucleic acid SEQ. ID. No.:27 and 
amino acid SEQ. ID. No.: 28. 

Figure 15 shows nucleic acid SEQ. ID. No.:29 and 
amino acid SEQ. ID. No.:30. 

Figure 16 shows nucleic acid SEQ. ID. No.:31 and 
amino acid SEQ. ID. No.: 32. 

Figure 17 shows nucleic acid SEQ. ID. No.: 33 and 
amino acid SEQ. ID. No.: 34. 

Figure 18 shows nucleic acid SEQ. ID. No.:35 and 
amino acid SEQ. ID. No.: 36. 

Figure 19 shows nucleic acid SEQ. ID. No.:37 and 
amino acid SEQ. ID. No.: 38. 

Figure 20 shows nucleic acid SEQ. ID. No.:39 and 
amino acid SEQ. ID. No.: 40. 

Figure 21 shows nucleic acid SEQ. ID. No.:41 and 
amino acid SEQ. ID. No.: 42. 

Figure 22 shows nucleic acid SEQ. ID. No.: 43 and
amino acid SEQ. ID. No.: 44. 

Figure 23 shows nucleic acid SEQ. ID. No.:45 and 
amino acid SEQ. ID. No.: 46. 

Figure 24 shows nucleic acid SEQ. ID. No.:47 and 
amino acid SEQ. ID. No.:48. 

Figure 25 shows nucleic acid SEQ. ID. No.:49 and 
amino acid SEQ. ID. No.: 50. 

Figure 26 shows nucleic acid SEQ. ID. No.: 51 and
amino acid SEQ. ID. No.: 52.

Figure 27 shows nucleic acid SEQ. ID. No.: 53 and
amino acid SEQ. ID. No.: 54.

Figure 28 shows nucleic acid SEQ. ID. No.: 55 and
amino acid SEQ. ID. No.: 56.

Figure 29 shows nucleic acid SEQ. ID. No.: 57 and
amino acid SEQ. ID. No.: 58.

Figure 30 shows nucleic acid SEQ. ID. No.: 59 and
amino acid SEQ. ID. No.: 60.

Figure 31 shows nucleic acid SEQ. ID. No.: 61 and
amino acid SEQ. ID. No.: 62.

Figure 32 shows nucleic acid SEQ. ID. No.: 63 and
amino acid SEQ. ID. No.: 64.

Figure 33 shows nucleic acid SEQ. ID. No.: 65 and
amino acid SEQ. ID. No.: 66.

Figure 34 shows nucleic acid SEQ. ID. No.: 67 and
amino acid SEQ. ID. No.: 68.

Figure 35 shows nucleic acid SEQ. ID. No.: 69 and
amino acid SEQ. ID. No.: 70.

Figure 36 shows nucleic acid SEQ. ID. No.: 71 and
amino acid SEQ. ID. No.: 72.

Figure 37 shows nucleic acid SEQ. ID. No.: 73 and
amino acid SEQ. ID. No.: 74.

Figure 38 shows nucleic acid SEQ. ID. No.: 75 and
amino acid SEQ. ID. No.: 76.

Figure 39 shows nucleic acid SEQ. ID. No.: 77 and
amino acid SEQ. ID. No.: 78.

Figure 40 shows nucleic acid SEQ. ID. No.: 79 and
amino acid SEQ. ID. No.: 80.

Figure 41 shows nucleic acid SEQ. ID. No.: 81 and
amino acid SEQ. ID. No.: 82.

Figure 42 shows nucleic acid SEQ. ID. No.: 83 and
amino acid SEQ. ID. No.: 84.

Figure 43 shows nucleic acid SEQ. ID. No.: 85 and
amino acid SEQ. ID. No.: 86.

Figure 44 shows nucleic acid SEQ. ID. No.: 87 and
amino acid SEQ. ID. No.: 88.

Figure 45 shows nucleic acid SEQ. ID. No.: 89 and
amino acid SEQ. ID. No.: 90.

Figure 46 shows nucleic acid SEQ. ID. No.: 91 and
amino acid SEQ. ID. No.: 92.

Figure 48 shows nucleic acid SEQ. ID. No.: 95 and
amino acid SEQ. ID. No.: 96.

Figure 49 shows nucleic acid SEQ. ID. No.: 97 and
amino acid SEQ. ID. No.: 98.

Figure 50 shows nucleic acid SEQ. ID. No.: 99 and
amino acid SEQ. ID. No.: 100.

Figure 51 shows nucleic acid SEQ. ID. No.: 101 and 
amino acid SEQ. ID. No.: 102. 

Figure 52 shows nucleic acid SEQ. ID. No.:103 and 
amino acid SEQ. ID. No.: 104. 

Figure 53 shows nucleic acid SEQ. ID. No.: 105 and
amino acid SEQ. ID. No.: 106.

Figure 54 shows nucleic acid SEQ. ID. No.: 107 and 
amino acid SEQ. ID. No.: 108. 

Figure 55 shows nucleic acid SEQ. ID. No.: 109 and 
amino acid SEQ. ID. No.: 110. 

Figure 56 shows nucleic acid SEQ. ID. No.: 111 and
amino acid SEQ. ID. No.: 112.

Figure 57 shows nucleic acid SEQ. ID. No.: 113 and
amino acid SEQ. ID. No.: 114.

Figure 58 shows nucleic acid SEQ. ID. No.: 115 and
amino acid SEQ. ID. No.: 116.

Figure 59 shows nucleic acid SEQ. ID. No.: 117 and
amino acid SEQ. ID. No.: 118.

Figure 60 shows nucleic acid SEQ. ID. No.: 119 and
amino acid SEQ. ID. No.: 120.

Figure 61 shows nucleic acid SEQ. ID. No.: 121 and
amino acid SEQ. ID. No.: 122.

Figure 62 shows nucleic acid SEQ. ID. No.: 123 and
amino acid SEQ. ID. No.: 124.

Figure 63 shows nucleic acid SEQ. ID. No.: 125 and
amino acid SEQ. ID. No.: 126.

Figure 64 shows nucleic acid SEQ. ID. No.: 127 and
amino acid SEQ. ID. No.: 128.

Figure 65 shows nucleic acid SEQ. ID. No.: 129 and
amino acid SEQ. ID. No.: 130.

Figure 66 shows nucleic acid SEQ. ID. No.: 131 and
amino acid SEQ. ID. No.: 132.

Figure 67 shows nucleic acid SEQ. ID. No.: 133 and
amino acid SEQ. ID. No.: 134.

Figure 68 shows nucleic acid SEQ. ID. No.: 135 and
amino acid SEQ. ID. No.: 136.

Figure 69 shows nucleic acid SEQ. ID. No.: 137 and
amino acid SEQ. ID. No.: 138.

Figure 70 shows nucleic acid SEQ. ID. No.: 139 and
amino acid SEQ. ID. No.: 140.

Figure 72 shows nucleic acid SEQ. ID. No.: 143 and
amino acid SEQ. ID. No.: 144.

Figure 73 shows nucleic acid SEQ. ID. No.: 145 and
amino acid SEQ. ID. No.: 146.

Figure 74 shows nucleic acid SEQ. ID. No.:147 and 
amino acid SEQ. ID. No.: 148. 

Figure 75 shows nucleic acid SEQ. ID No.: 149 and 
amino acid SEQ. ID. No.: 150. 

Figure 76 shows nucleic acid SEQ. ID No.: 151 and
amino acid SEQ. ID. No.: 152. 

Figure 77 shows nucleic acid SEQ. ID No.: 153 and 
amino acid SEQ. ID. No.: 154. 

Figure 78 shows nucleic acid SEQ. ID No.: 155 and 
amino acid SEQ. ID. No.: 156. 

Figure 79 shows nucleic acid SEQ. ID No.: 157 and 
amino acid SEQ. ID. No.: 158. 

Figure 80 shows nucleic acid SEQ. ID No.: 159 and 
amino acid SEQ. ID. No.: 160. 

Figure 81 shows nucleic acid SEQ. ID No.: 161 and 
amino acid SEQ. ID. No.: 162. 

Figure 82 shows nucleic acid SEQ. ID No.: 163 and
amino acid SEQ. ID. No.: 164. 

Figure 83 shows nucleic acid SEQ. ID No.: 165 and 
amino acid SEQ. ID. No.: 166. 

Figure 84 shows nucleic acid SEQ. ID No.: 167 and 
amino acid SEQ. ID. No.: 168. 

Figure 85 shows nucleic acid SEQ. ID No.: 169 and 
amino acid SEQ. ID. No.: 170.

Figure 86 shows nucleic acid SEQ. ID No.: 171 and 
amino acid SEQ. ID. No.: 172. 

Figure 87 shows nucleic acid SEQ. ID No.: 173 and 
amino acid SEQ. ID. No.: 174.

Figure 88 shows nucleic acid SEQ. ID No.: 175 and
amino acid SEQ. ID. No.: 176.

Figure 89 shows nucleic acid SEQ. ID No.: 177 and 
amino acid SEQ. ID. No.: 178.

Figure 90 shows nucleic acid SEQ. ID No.: 179 and 
amino acid SEQ. ID. No.: 180. 

Figure 91 shows nucleic acid SEQ. ID No.: 181 and 
amino acid SEQ. ID. No.: 182. 

Figure 92 shows nucleic acid SEQ. ID No.: 183 and 
amino acid SEQ. ID. No.: 184. 

Figure 93 shows nucleic acid SEQ. ID No.: 185 and 
amino acid SEQ. ID. No.: 186. 

Figure 94 shows nucleic acid SEQ. ID No.: 187 and 
amino acid SEQ. ID. No.: 188. 

Figure 95 shows nucleic acid SEQ. ID No.: 189 and 
amino acid SEQ. ID. No.: 190. 

Figure 96 shows nucleic acid SEQ. ID No.: 191 and 
amino acid SEQ. ID. No.: 192. 

Figure 97 shows nucleic acid SEQ. ID No.: 193 and 
amino acid SEQ. ID. No.: 194. 

Figure 98 shows nucleic acid SEQ. ID No.: 195 and 
amino acid SEQ. ID. No.: 196. 

Figure 99 shows nucleic acid SEQ. ID No.: 197 and 
amino acid SEQ. ID. No.: 198. 

Figure 100 shows nucleic acid SEQ. ID No.: 199 and 
amino acid SEQ. ID. No.: 200. 

Figure 101 shows nucleic acid SEQ. ID No.: 201 and 
amino acid SEQ. ID. No.: 202. 

Figure 102 shows nucleic acid SEQ. ID No.: 203 and
amino acid SEQ. ID. No.: 204.

Figure 103 shows nucleic acid SEQ. ID No.: 205 and
amino acid SEQ. ID. No.: 206.

Figure 104 shows nucleic acid SEQ. ID No.: 207 and
amino acid SEQ. ID. No.: 208.

Figure 105 shows nucleic acid SEQ. ID No.: 209 and
amino acid SEQ. ID. No.: 210.

Figure 106 shows nucleic acid SEQ. ID No.: 211 and
amino acid SEQ. ID. No.: 212.

Figure 107 shows nucleic acid SEQ. ID No.: 213 and
amino acid SEQ. ID. No.: 214.

Figure 108 shows nucleic acid SEQ. ID No.: 215 and
amino acid SEQ. ID. No.: 216.

Figure 109 shows nucleic acid SEQ. ID No.: 217 and
amino acid SEQ. ID. No.: 218.

Figure 110 shows nucleic acid SEQ. ID No.: 219 and
amino acid SEQ. ID. No.: 220.

Figure 111 shows nucleic acid SEQ. ID No.: 221 and
amino acid SEQ. ID. No.: 222.

Figure 112 shows nucleic acid SEQ. ID No.: 223 and
amino acid SEQ. ID. No.: 224.

Figure 113 shows nucleic acid SEQ. ID No.: 225 and
amino acid SEQ. ID. No.: 226.

Figure 114 shows nucleic acid SEQ. ID No.: 227 and
amino acid SEQ. ID. No.: 228.

Figure 115 shows nucleic acid SEQ. ID No.: 229 and
amino acid SEQ. ID. No.: 230.

Figure 116 shows nucleic acid SEQ. ID No.: 231 and
amino acid SEQ. ID. No.: 232.

Figure 117 shows nucleic acid SEQ. ID No.: 233 and
amino acid SEQ. ID. No.: 234.

Figure 118 shows nucleic acid SEQ. ID No.: 235 and
amino acid SEQ. ID. No.: 236.

Figure 119 shows nucleic acid SEQ. ID No.: 237 and 
amino acid SEQ. ID. No.: 238. 

Figure 120 shows nucleic acid SEQ. ID No.: 239 and 
amino acid SEQ. ID. No.: 240. 

Figure 121 shows nucleic acid SEQ. ID No.: 241 and 
amino acid SEQ. ID. No.: 242. 

Figure 122 shows nucleic acid SEQ. ID No.: 243 and 
amino acid SEQ. ID. No.: 244. 

Figure 123 shows nucleic acid SEQ. ID No.: 245 and 
amino acid SEQ. ID. No.: 246. 

Figure 124 shows nucleic acid SEQ. ID No.: 247 and 
amino acid SEQ. ID. No.: 248. 

Figure 125 shows nucleic acid SEQ. ID No.: 249 and 
amino acid SEQ. ID. No.: 250. 

Figure 126 shows nucleic acid SEQ. ID No.: 251 and 
amino acid SEQ. ID. No.: 252. 

Figure 127 shows nucleic acid SEQ. ID No.: 253 and 
amino acid SEQ. ID. No.: 254. 

Figure 128 shows nucleic acid SEQ. ID No.: 255 and 
amino acid SEQ. ID. No.: 256. 

Figure 129 shows nucleic acid SEQ. ID No.: 257 and 
amino acid SEQ. ID. No.: 258. 

Figure 130 shows nucleic acid SEQ. ID No.: 259 and 
amino acid SEQ. ID. No.: 260. 

Figure 131 shows nucleic acid SEQ. ID No.: 261 and 
amino acid SEQ. ID. No.: 262. 

Figure 132 shows nucleic acid SEQ. ID No.: 263 and 
amino acid SEQ. ID. No.: 264. 






Figure 133 shows nucleic acid SEQ. ID No.: 265 and 
amino acid SEQ. ID. No.: 266. 

Figure 134 shows nucleic acid SEQ. ID No.: 267 and 
amino acid SEQ. ID. No.: 268. 

Figure 135 shows nucleic acid SEQ. ID No.: 269 and 
amino acid SEQ. ID. No.: 270. 

Figure 136 shows nucleic acid SEQ. ID No.: 271 and 
amino acid SEQ. ID. No.: 272. 

Figure 137 shows nucleic acid SEQ. ID No.: 273 and 
amino acid SEQ. ID. No.: 274. 

Figure 138 shows nucleic acid SEQ. ID No.: 275 and 
amino acid SEQ. ID. No.: 276.

Figure 139 shows nucleic acid SEQ. ID No.: 277 and 
amino acid SEQ. ID. No.: 278. 

Figure 140 shows nucleic acid SEQ. ID No.: 279 and 
amino acid SEQ. ID. No.: 280. 

Figure 141 shows nucleic acid SEQ. ID No.: 281 and
amino acid SEQ. ID. No.: 282. 

Figure 142 shows nucleic acid SEQ. ID No.: 283 and 
amino acid SEQ. ID. No.: 284. 

Figure 143 shows nucleic acid SEQ. ID No.: 285 and 
amino acid SEQ. ID. No.: 286.

Figure 144 shows nucleic acid SEQ. ID No.: 287 and 
amino acid SEQ. ID. No.: 288. 

Figure 145 shows nucleic acid SEQ. ID No.: 289 and 
amino acid SEQ. ID. No.: 290. 

Figure 146 shows nucleic acid SEQ. ID No.: 291 and 
amino acid SEQ. ID. No.: 292. 

Figure 147 shows nucleic acid SEQ. ID No.: 293 and 
amino acid SEQ. ID. No.: 294.

Figure 148 shows nucleic acid SEQ. ID No.: 295 and
amino acid SEQ. ID. No.: 296.

Figure 149 shows nucleic acid SEQ. ID No.: 297 and 
amino acid SEQ. ID. No.: 298. 

Figure 151 shows a comparison of Sequence Groups. 

Figure 152 illustrates alignment of full length 
clones. 

Figure 153 shows a procedure used for cloning of
cytochrome p450 cDNA fragments by PCR.
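
In the figure list above, each sequence figure pairs an odd-numbered nucleic acid SEQ. ID. No. with the following even-numbered amino acid SEQ. ID. No., and the list contains no Figure 47 or Figure 71. The following Python sketch merely restates that numbering pattern for convenience; the function name is hypothetical, and the rule is inferred from the list rather than stated in the specification.

```python
# Figures absent from the list above (no SEQ. ID. Nos. 93/94 or 141/142).
SKIPPED_FIGURES = {47, 71}
LAST_SEQUENCE_FIGURE = 149


def seq_ids_for_figure(figure):
    """Return (nucleic acid SEQ ID, amino acid SEQ ID) for a sequence
    figure, following the 2N-1 / 2N pattern seen in the figure list."""
    if not 1 <= figure <= LAST_SEQUENCE_FIGURE or figure in SKIPPED_FIGURES:
        raise ValueError(f"Figure {figure} is not a listed sequence figure")
    return 2 * figure - 1, 2 * figure


if __name__ == "__main__":
    for fig in (1, 48, 73, 149):
        print(fig, seq_ids_for_figure(fig))
```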

DETAILED DESCRIPTION 

DEFINITIONS 

Unless defined otherwise, all technical and 
scientific terms used herein have the same meaning as 
commonly understood by one of ordinary skill in the art 
to which this invention belongs. Singleton et al . (1994) 
Dictionary of Microbiology and Molecular Biology, second 
edition, John Wiley and Sons (New York) provides one of 
skill with a general dictionary of many of the terms used 
in this invention. All patents and publications referred 
to herein are incorporated by reference herein. For 
purposes of the present invention, the following terms 
are defined below. 

"Enzymatic activity" is meant to include 
demethylation, hydroxylation, epoxidation, N-oxidation,
sulfoxidation, N-, S-, and O-dealkylations,
desulfation, deamination, and reduction of azo, nitro,
and N-oxide groups. The term "nucleic acid" refers to a
deoxyribonucleotide or ribonucleotide polymer in either
single- or double-stranded form, or sense or anti-sense,
and unless otherwise limited, encompasses known analogues
of natural nucleotides that hybridize to nucleic acids in
a manner similar to naturally occurring nucleotides.
Unless otherwise indicated, a particular nucleic acid
sequence includes the complementary sequence thereof.

The terms "operably linked", "in operable
combination", and "in operable order" refer to functional 
linkage between a nucleic acid expression control 
sequence (such as a promoter, signal sequence, or array 
of transcription factor binding sites) and a second 
nucleic acid sequence, wherein the expression control 
sequence affects transcription and/or translation of the 
nucleic acid corresponding to the second sequence. 

The term "recombinant" when used with reference to 
a cell indicates that the cell replicates a heterologous 
nucleic acid, expresses said nucleic acid or expresses a 
peptide, heterologous peptide, or protein encoded by a 
heterologous nucleic acid. Recombinant cells can express 
genes or gene fragments in either the sense or antisense 
form that are not found within the native 
(non-recombinant) form of the cell. Recombinant cells can 
also express genes that are found in the native form of the 
cell, but wherein the genes are modified and 
re-introduced into the cell by artificial means. 

A "structural gene" is that portion of a gene 
comprising a DNA segment encoding a protein, polypeptide 
or a portion thereof, and excluding the 5' sequence which 
drives the initiation of transcription. The structural 
gene may alternatively encode a nontranslatable product. 
The structural gene may be one which is normally found in 
the cell or one which is not normally found in the cell 
or cellular location wherein it is introduced, in which 
case it is termed a "heterologous gene". A heterologous 
gene may be derived in whole or in part from any source 
known to the art, including a bacterial genome or 
episome, eukaryotic, nuclear or plasmid DNA, cDNA, viral 
DNA or chemically synthesized DNA. A structural gene may 
contain one or more modifications that could affect 
biological activity or its characteristics, the 
biological activity or the chemical structure of the 
expression product, the rate of expression or the manner 
of expression control. Such modifications include, but 
are not limited to, mutations, insertions, deletions and 
substitutions of one or more nucleotides. The structural 
gene may constitute an uninterrupted coding sequence or 
it may include one or more introns, bounded by the 
appropriate splice junctions. The structural gene may be 
translatable or non-translatable, including in an 
anti-sense orientation. The structural gene may be a 
composite of segments derived from a plurality of sources 
and from a plurality of gene sequences (naturally 
occurring or synthetic, where synthetic refers to DNA 
that is chemically synthesized). 

"Derived from" is used to mean taken, obtained, 
received, traced, replicated or descended from a source 
(chemical and/or biological). A derivative may be 
produced by chemical or biological manipulation 
(including, but not limited to, substitution, addition, 
insertion, deletion, extraction, isolation, mutation and 
replication) of the original source. 

"Chemically synthesized", as related to a sequence 
of DNA, means that portions of the component nucleotides 
were assembled in vitro. Manual chemical synthesis of 
DNA may be accomplished using well established procedures 
(Caruthers, Methodology of DNA and RNA Sequencing, 
(1983), Weissman (ed.), Praeger Publishers, New York, 
Chapter 1); automated chemical synthesis can be performed 
using one of a number of commercially available machines. 

Optimal alignment of sequences for comparison may be 
conducted by the local homology algorithm of Smith and 
Waterman, Adv. Appl. Math. 2:482 (1981), by the homology 
alignment algorithm of Needleman and Wunsch, J. Mol. 
Biol. 48:443 (1970), by the search for similarity method 
of Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.) 
85:2444 (1988), by computerized implementations of these 
algorithms (GAP, BESTFIT, FASTA, and TFASTA in the 
Wisconsin Genetics Software Package, Genetics Computer 
Group, 575 Science Dr., Madison, Wis.), or by inspection. 
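
By way of non-limiting illustration, the global alignment approach of Needleman and Wunsch referred to above can be sketched as a dynamic programming recurrence. The Python sketch below returns only the optimal alignment score. The function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example; they are not parameters specified herein, nor those of the GAP or BESTFIT programs.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b.

    Illustrative scoring only: linear gap penalty, fixed match/mismatch.
    """
    # Row 0: aligning an empty prefix of a against each prefix of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against an empty prefix of b.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]
```

For example, two identical seven-base sequences score 7 under these assumed values, and aligning "ACGT" against "ACT" scores 2 (three matches and one gap).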

The NCBI Basic Local Alignment Search Tool (BLAST) 
(Altschul et al., 1990) is available from several 
sources, including the National Center for Biotechnology 
Information (NCBI, Bethesda, Md.) and on the Internet, 
for use in connection with the sequence analysis programs 



WO 2004/035745 



PCT/US2003/032722 



blastp, blastn, blastx, tblastn and tblastx. It can be 
accessed at http://www.ncbi.nlm.nih.gov/BLAST/. A 
description of how to determine sequence identity using 
this program is available at 
http://www.ncbi.nlm.nih.gov/BLAST/blast_help.html. 

The terms "substantial amino acid identity" or 
"substantial amino acid sequence identity" as applied to 
amino acid sequences and as used herein denote a 
characteristic of a polypeptide, wherein the peptide 
comprises a sequence that has at least 70 percent 
sequence identity, preferably 80 percent amino acid 
sequence identity, more preferably 90 percent amino acid 
sequence identity, and most preferably at least 99 to 100 
percent sequence identity as compared to a reference 
group over a region corresponding to the first amino acid 
following the cytochrome p450 motif GXRXCX(G/A) to the 
stop codon of the translated peptide. 

The terms "substantial nucleic acid identity" or 
"substantial nucleic acid sequence identity" as applied 
to nucleic acid sequences and as used herein denote a 
characteristic of a polynucleotide sequence, wherein the 
polynucleotide comprises a sequence that has at least 75 
percent sequence identity, preferably 81 percent nucleic 
acid sequence identity, more preferably at least 91 
percent sequence identity, and most preferably at least 
99 to 100 percent sequence identity as compared to a 
reference group over a region corresponding to the first 
nucleic acid following the cytochrome p450 motif 
GXRXCX(G/A) to the stop codon of the translated peptide. 
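
As a non-limiting sketch of how the comparison region defined above might be located and scored, the following Python example finds the GXRXCX(G/A) motif in a translated peptide, takes the residues from the first amino acid after the motif to the stop codon, and computes percent identity between two sequences that have already been aligned. The function names, the use of "*" as a stop marker, the use of "-" as a gap character, and the example peptide are all assumptions made for illustration. The identity calculation shown (identical columns divided by compared columns) is one common convention and is not asserted to be the calculation performed by any particular alignment program.

```python
import re

# GXRXCX(G/A): G, any residue, R, any residue, C, any residue, then G or A.
P450_MOTIF = re.compile(r"G.R.C.[GA]")


def region_after_motif(peptide):
    """Return residues from just after the first motif match to the stop.

    A trailing '*' (assumed stop-codon marker) is dropped. Returns None
    when the motif is absent.
    """
    match = P450_MOTIF.search(peptide)
    if match is None:
        return None
    return peptide[match.end():].rstrip("*")


def percent_identity(aligned_a, aligned_b):
    """Percent identical columns between two pre-aligned, equal-length strings.

    '-' marks a gap; columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def has_substantial_identity(pct, threshold=70.0):
    """True when pct meets the chosen threshold (70 for amino acids above)."""
    return pct >= threshold
```

With the hypothetical peptide "MKFGRRACPGLAMKV*", the motif matches "GRRACPG" and the comparison region is "LAMKV". Comparing "LAMKV" with "LAMRV" gives 80 percent identity, which meets the 70 percent threshold.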

Another indication that nucleotide sequences are 
substantially identical is if two molecules hybridize to 
each other under stringent conditions. Stringent 
conditions are sequence-dependent and will be different 
in different circumstances. Generally, stringent 
conditions are selected to be about 5°C to about 20°C, 
usually about 10°C to about 15°C, lower than the thermal 
melting point (Tm) for the specific sequence at a defined 
ionic strength and pH. The Tm is the temperature (under 
defined ionic strength and pH) at which 50% of the target 
sequence hybridizes to a matched probe. Typically, 
stringent conditions will be those in which the salt 
concentration is about 0.02 molar at pH 7 and the 
temperature is at least about 60°C. For instance, in a 
standard Southern hybridization procedure, stringent 
conditions will include an initial wash in 6xSSC at 42°C 
followed by one or more additional washes in 0.2xSSC at 
a temperature of at least about 55°C, typically about 
60°C and often about 65°C. 
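
The relationship between Tm and the stringent temperature range stated above is simple arithmetic and may be sketched as follows. The function names and the example Tm of 80°C are hypothetical; the Tm itself must be determined for the specific sequence, ionic strength and pH, and is not calculated here.

```python
def stringent_temperature_window(tm_c, low_offset=20.0, high_offset=5.0):
    """Return (min_temp, max_temp) in deg C lying 5 to 20 deg C below Tm."""
    return (tm_c - low_offset, tm_c - high_offset)


def usual_temperature_window(tm_c):
    """Return the narrower window lying 10 to 15 deg C below Tm."""
    return stringent_temperature_window(tm_c, low_offset=15.0, high_offset=10.0)
```

For a hypothetical Tm of 80°C, the general window is 60°C to 75°C and the usual window is 65°C to 70°C.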

Nucleotide sequences are also substantially 
identical for purposes of this invention when the 
polypeptides and/or proteins which they encode are 
substantially identical. Thus, where one nucleic acid 
sequence encodes essentially the same polypeptide as a 
second nucleic acid sequence, the two nucleic acid 
sequences are substantially identical, even if they would 
not hybridize under stringent conditions due to 
degeneracy permitted by the genetic code (see Darnell et 
al. (1990) Molecular Cell Biology, Second Edition, 
Scientific American Books, W. H. Freeman and Company, New 
York, for an explanation of codon degeneracy and the 
genetic code). 

Protein purity or homogeneity can be 
indicated by a number of means well known in the art, 
such as polyacrylamide gel electrophoresis of a protein 
sample, followed by visualization upon staining. For 
certain purposes high resolution may be needed and HPLC 
or a similar means for purification may be utilized. 

As used herein, the term "vector" is used in 
reference to nucleic acid molecules that transfer DNA 
segment(s) into a cell. A vector may act to replicate 
DNA and may reproduce independently in a host cell. The 
term "vehicle" is sometimes used interchangeably with 
"vector." The term "expression vector" as used herein 
refers to a recombinant DNA molecule containing a desired 
coding sequence and appropriate nucleic acid sequences 
necessary for the expression of the operably linked 
coding sequence in a particular host organism. Nucleic 
acid sequences necessary for expression in prokaryotes 
usually include a promoter, an operator (optional), and 
a ribosome binding site, often along with other 
sequences. Eukaryotic cells are known to utilize 
promoters, enhancers, and termination and polyadenylation 
signals. 



For the purpose of regenerating complete genetically 
engineered plants with roots, a nucleic acid may be 
inserted into plant cells, for example, by any technique 
such as in vivo inoculation or by any of the known in 
vitro tissue culture techniques to produce transformed 
plant cells that can be regenerated into complete plants. 
Thus, for example, the insertion into plant cells may be 
by in vitro inoculation by pathogenic or non-pathogenic 
A. tumefaciens. Other such tissue culture techniques may 
also be employed. 

"Plant tissue" includes differentiated and 
undifferentiated tissues of plants, including, but not 
limited to, roots, shoots, leaves, pollen, seeds, tumor 
tissue and various forms of cells in culture, such as 
single cells, protoplasts, embryos and callus tissue. The 
plant tissue may be in planta or in organ, tissue or cell 
culture. 

"Plant cell" as used herein includes plant cells in 
planta and plant cells and protoplasts in culture. 

"cDNA" or "complementary DNA" generally refers to a 
single stranded DNA molecule with a nucleotide sequence 
that is complementary to an RNA molecule. cDNA is formed 
by the action of the enzyme reverse transcriptase on an 
RNA template. 

STRATEGIES FOR OBTAINING NUCLEIC ACID SEQUENCES 

In accordance with the present invention, RNA was 
extracted from Nicotiana tissue of converter and 
non-converter Nicotiana lines. The extracted RNA was then 
used to create cDNA. Nucleic acid sequences of the 
present invention were then generated using two 
strategies. 

In the first strategy, the poly A enriched RNA was 
extracted from plant tissue and cDNA was made by reverse 
transcription PCR. The single strand cDNA was then used 
to create p450 specific PCR populations using degenerate 
primers plus an oligo d(T) reverse primer. The primer 
design was based on the highly conserved motifs of p450. 
Examples of specific degenerate primers are set forth in 
Figure 1. Sequence fragments from plasmids containing 
appropriate size inserts were further analyzed. These 
size inserts typically ranged from about 300 to about 800 
nucleotides depending on which primers were used. 
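
As a non-limiting illustration of what a degenerate primer represents, the Python sketch below expands a primer written with IUPAC nucleotide ambiguity codes into the individual sequences it stands for and counts them. The example primer "AARGGNTGY" is hypothetical and is not one of the primers of the present invention; the function names are likewise assumptions.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

The hypothetical primer "AARGGNTGY" has a degeneracy of 16 (2 x 4 x 2). The same expansion applied to an 18-nucleotide oligo d(T) followed by one random base, written here as eighteen T's plus N, yields four reverse primers.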

In a second strategy, a cDNA library was initially 
constructed. The cDNA in the plasmids was used to create 
p450 specific PCR populations using degenerate primers 
plus the T7 primer on the plasmid as the reverse primer. 
As in the first strategy, sequence fragments from plasmids 
containing appropriate size inserts were further 
analyzed. 

Nicotiana plant lines known to produce high levels 
of nornicotine (converter) and plant lines having 
undetectable levels of nornicotine may be used as 
starting materials. 

Leaves can then be removed from plants and treated 
with ethylene to activate p450 enzymatic activities 
defined herein. Total RNA is extracted using techniques 
known in the art. cDNA fragments can then be generated 
using PCR (RT-PCR) with the oligo d(T) primer as 
described in Figure 153. The cDNA library can then be 
constructed as more fully described in the examples herein. 

The conserved region of p450 type enzymes can be 
used as a template for degenerate primers (Figure 75). 
Using degenerate primers, p450 specific bands can be 
amplified by PCR. Bands indicative of p450-like enzymes 
can be identified by DNA sequencing. PCR fragments can 
be characterized using BLAST search, alignment or other 
tools to identify appropriate candidates. 

Sequence information from identified fragments can 
be used to develop PCR primers. These primers, in 
combination with plasmid primers in the cDNA library, were 
used to clone full length p450 genes. Large-scale reverse 
Southern analysis was conducted to examine the 
differential expression for all fragment clones obtained 
and in some cases full length clones. In this aspect of 
the invention, these large-scale reverse Southern assays 
can be conducted using labeled total cDNAs from 
different tissues as a probe to hybridize with cloned DNA 
fragments in order to screen all cloned inserts. 

Nonradioactive and radioactive (32P) Northern 
blotting assays were also used to characterize cloned 
p450 fragments and full length clones. 

Peptide specific antibodies were made against 
several full-length clones by deriving their amino acid 
sequence and selecting peptide regions that were 
antigenic and unique relative to other clones. Rabbit 
antibodies were made to synthetic peptides conjugated to 
a carrier protein. Western blotting analyses or other 
immunological methods were performed on plant tissue 
using these antibodies. 

Nucleic acid sequences identified as described above 
can be examined by using virus induced gene silencing 
technology (VIGS, Baulcombe, Current Opinions in Plant 
Biology, 1999, 2:109-113). 

Peptide specific antibodies were made for several 
full-length clones by deriving their amino acid sequence 
and selecting peptide regions that were potentially 
antigenic and were unique relative to other clones. 
Rabbit antibodies were made to synthetic peptides 
conjugated to a carrier protein. Western blotting 
analyses were performed using these antibodies. 

In another aspect of the invention, interfering RNA 
technology (RNAi) is used to further characterize 
cytochrome p450 enzymatic activities in Nicotiana plants 
of the present invention. The following references, which 
describe this technology, are incorporated by reference 
herein: Smith et al., Nature, 2000, 407:319-320; Fire et 
al., Nature, 1998, 391:806-811; Waterhouse et al., PNAS, 
1998, 95:13959-13964; Stalberg et al., Plant Molecular 
Biology, 1993, 23:671-683; Baulcombe, Current Opinions 
in Plant Biology, 1999, 2:109-113; and Brigneti et al., 
EMBO Journal, 1998, 17(22):6739-6746. Plants may be 
transformed using RNAi techniques, antisense techniques, 
or a variety of other methods described. 

Several techniques exist for introducing foreign 
genetic material into plant cells, and for obtaining 
plants that stably maintain and express the introduced 
gene. Such techniques include acceleration of genetic 
material coated onto microparticles directly into cells 
(US Patents 4,945,050 to Cornell and 5,141,131 to 
DowElanco). Plants may be transformed using 
Agrobacterium technology, see US Patent 5,177,010 to 
University of Toledo, 5,104,310 to Texas A&M, European 
Patent Application 0131624B1, European Patent 
Applications 120516, 159418B1 and 176,112 to Schilperoot, 
US Patents 5,149,645, 5,469,976, 5,464,763 and 4,940,838 
and 4,693,976 to Schilperoot, European Patent 
Applications 116718, 290799, 320500 all to Max Planck, 
European Patent Applications 604662 and 627752 to Japan 
Nicotiana, European Patent Applications 0267159, and 
0292435 and US Patent 5,231,019 all to Ciba Geigy, US 
Patents 5,463,174 and 4,762,785 both to Calgene, and US 
Patents 5,004,863 and 5,159,135 both to Agracetus. Other 
transformation technology includes whiskers technology, 
see U.S. Patents 5,302,523 and 5,464,765 both to Zeneca. 
Electroporation technology has also been used to 
transform plants, see WO 87/06614 to Boyce Thompson 
Institute, 5,472,869 and 5,384,253 both to Dekalb, 
WO9209696 and WO9321335 both to PGS. All of these 
transformation patents and publications are incorporated 
by reference. In addition to numerous technologies for 
transforming plants, the type of tissue which is 
contacted with the foreign genes may vary as well. Such 
tissue would include but would not be limited to 
embryogenic tissue, callus tissue type I and II, 
hypocotyl, meristem, and the like. Almost all plant 
tissues may be transformed during dedifferentiation using 
appropriate techniques within the skill of an artisan. 

Foreign genetic material introduced into a plant may 
include a selectable marker. The preference for a 
particular marker is at the discretion of the artisan, 
but any of the following selectable markers may be used 
along with any other gene not listed herein which could 
function as a selectable marker. Such selectable markers 
include but are not limited to aminoglycoside 
phosphotransferase gene of transposon Tn5 (Aph II) which 
encodes resistance to the antibiotics kanamycin, neomycin 
and G418, as well as those genes which code for 
resistance or tolerance to glyphosate; hygromycin; 
methotrexate; phosphinothricin (bar); imidazolinones, 
sulfonylureas and triazolopyrimidine herbicides, such as 
chlorosulfuron; bromoxynil, dalapon and the like. 

In addition to a selectable marker, it may be 
desirable to use a reporter gene. In some instances a 
reporter gene may be used without a selectable marker. 
Reporter genes are genes which are typically not present 
or expressed in the recipient organism or tissue. The 
reporter gene typically encodes a protein which 
provides for some phenotypic change or enzymatic property. 
Examples of such genes are provided in K. Weising et al., 
Ann. Rev. Genetics, 22, 421 (1988), which is incorporated 
herein by reference. Preferred reporter genes include 
without limitation glucuronidase (GUS) gene and GFP 
genes. 

Once introduced into the plant tissue, the 
expression of the structural gene may be assayed by any 
means known to the art, and expression may be measured as 
mRNA transcribed, protein synthesized, or the amount of 
gene silencing that occurs (see U.S. Patent No. 
5,583,021 which is hereby incorporated by reference). 
Techniques are known for the in vitro culture of plant 
tissue, and in a number of cases, for regeneration into 
whole plants (EP Appln No. 88810309.0). Procedures for 
transferring the introduced expression complex to 
commercially useful cultivars are known to those skilled 
in the art. 



Once plant cells expressing the desired level of 
p450 enzyme are obtained, plant tissues and whole plants 
can be regenerated therefrom using methods and techniques 
well-known in the art. The regenerated plants are then 
reproduced by conventional means and the introduced genes 
can be transferred to other strains and cultivars by 
conventional plant breeding techniques. 

The following examples illustrate methods for 
carrying out the invention and should be understood to be 
illustrative of, but not limiting upon, the scope of the 
invention which is defined in the appended claims. 

EXAMPLES 

EXAMPLE 1: DEVELOPMENT OF PLANT TISSUE AND ETHYLENE 
TREATMENT 

Plant Growth 

Plants were seeded in pots and grown in a greenhouse 
for 4 weeks. The 4 week old seedlings were transplanted 
into individual pots and grown in the greenhouse for 2 
months. The plants were watered 2 times a day with water 
containing 150 ppm NPK fertilizer during growth. The 
expanded green leaves were detached from plants for the 
ethylene treatment described below. 

Cell Line 78379 

Tobacco line 78379, which is a burley tobacco line 
released by the University of Kentucky, was used as a 
source of plant material. One hundred plants were 
cultured as standard in the art of growing tobacco and 
transplanted and tagged with a distinctive number 
(1-100). Fertilization and field management were conducted 
as recommended. 

Three quarters of the 100 plants converted between 
20 and 100% of the nicotine to nornicotine. One quarter 
of the 100 plants converted less than 5% of the nicotine 
to nornicotine. Plant number 87 had the least conversion 
(2%) while plant number 21 had 100% conversion. Plants 
converting less than 3% were classified as 
non-converters. Self-pollinated seed of plant number 87 and 
plant number 21, as well as crossed (21 x 87 and 87 x 21) 
seeds were made to study genetic and phenotypic 
differences. Plants from selfed 21 were converters, and 
99% of selfs from 87 were non-converters. The other 1% 
of the plants from 87 showed low conversion (5-15%). 
Plants from reciprocal crosses were all converters. 

Cell Line 4407 

Nicotiana line 4407, which is a burley line, was used 
as a source of plant material. Uniform and 
representative plants (100) were selected and tagged. Of 
the 100 plants 97 were non-converters and three were 
converters. Plant number 56 had the least amount of 
conversion (1.2%) and plant number 58 had the highest 
level of conversion (96%). Self-pollinated seeds and 
crossed seeds were made with these two plants. 
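
The converter and non-converter designations used for the lines above can be illustrated with a short Python sketch. The formula for percent conversion (nornicotine as a percentage of nicotine plus nornicotine) is an assumption made for this example, since the calculation is not spelled out in this section. The 3% cutoff is the one stated for line 78379, and the function names and input values are hypothetical.

```python
def percent_conversion(nicotine, nornicotine):
    """Nornicotine as a percentage of nicotine + nornicotine (assumed formula).

    Both inputs must be in the same units (for example, mg per g leaf).
    """
    total = nicotine + nornicotine
    if total <= 0:
        raise ValueError("nicotine + nornicotine must be positive")
    return 100.0 * nornicotine / total


def classify(conversion_pct, cutoff=3.0):
    """Label a plant using the 'less than 3%' non-converter cutoff."""
    return "non-converter" if conversion_pct < cutoff else "converter"
```

Under this assumed formula, a plant with 98 units of nicotine and 2 units of nornicotine shows 2% conversion and is labeled a non-converter, as was plant number 87.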

Plants from selfed-58 segregated with a 3:1 converter 
to non-converter ratio. Plants 58-33 and 58-25 were 
identified as homozygous converter and nonconverter plant 
lines, respectively. The stable conversion of 58-33 was 
confirmed by analysis of its progeny in the next 
generation. 
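
One conventional way to check whether observed progeny counts fit a 3:1 segregation ratio is a chi-square goodness-of-fit test. The sketch below is offered only as an illustration of that general technique; no such test is stated to have been used herein, the progeny counts are hypothetical, and the function names are assumptions. The critical value 3.841 is the standard chi-square value for one degree of freedom at the 5% level.

```python
CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 d.f., alpha = 0.05


def chi_square_3_to_1(converters, non_converters):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    n = converters + non_converters
    if n == 0:
        raise ValueError("at least one plant is required")
    expected_conv = 0.75 * n
    expected_non = 0.25 * n
    return ((converters - expected_conv) ** 2 / expected_conv
            + (non_converters - expected_non) ** 2 / expected_non)


def consistent_with_3_to_1(converters, non_converters):
    """True when the 3:1 hypothesis is not rejected at the 5% level."""
    return chi_square_3_to_1(converters, non_converters) < CRITICAL_5PCT_DF1
```

For hypothetical counts of 75 converters and 25 non-converters the statistic is 0, whereas 60 and 40 give a statistic of 12, which would reject a 3:1 ratio.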

Cell Line PBLB01 

PBLB01 is a burley line developed by ProfiGen, Inc. 
and was used as a source of plant material. The 
converter plant was selected from foundation seeds of 
PBLB01. 

Ethylene Treatment Procedures 

Green leaves were detached from 2-3 month greenhouse 
grown plants and sprayed with 0.3% ethylene solution 
(Prep brand Ethephon (Rhone-Poulenc)). Each sprayed leaf 
was hung in a curing rack equipped with humidifier and 
covered with plastic. During the treatment, the sample 
leaves were periodically sprayed with the ethylene 
solution. Approximately 24-48 hours post ethylene 
treatment, leaves were collected for RNA extraction. 
Another sub-sample was taken for metabolic constituent 
analysis to determine the concentration of leaf 
metabolites and more specific constituents of interest 
such as a variety of alkaloids. 

As an example, alkaloid analysis could be performed 
as follows. Samples (0.1 g) were shaken at 150 rpm with 
0.5 ml 2N NaOH, and a 5 ml extraction solution which 
contained quinoline as an internal standard and methyl 
t-butyl ether. Samples were analyzed on an HP 6890 GC 
equipped with a FID detector. A temperature of 250°C was 
used for the detector and injector. An HP column 
(30 m x 0.32 mm x 1 µm) consisting of fused silica 
crosslinked with 5% phenol and 95% methyl silicon was used 
at a temperature gradient of 110-185°C at 10°C per minute. 
The column was operated at 100°C with a flow rate of 
1.7 cm3 min-1 with a split ratio of 40:1 with a 2 µl 
injection volume using helium as the carrier gas. 
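
The duration of the oven temperature gradient described above follows directly from the stated start temperature, end temperature and rate. The Python sketch below assumes a single linear ramp followed by a hold at the final temperature; that ramp shape and the function names are assumptions for illustration, as the instrument method is not otherwise detailed here.

```python
def ramp_minutes(start_c, end_c, rate_c_per_min):
    """Time in minutes for a linear ramp from start_c to end_c."""
    if rate_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return (end_c - start_c) / rate_c_per_min


def oven_temperature(t_min, start_c=110.0, end_c=185.0, rate_c_per_min=10.0):
    """Oven temperature t_min minutes into the ramp, held at end_c afterwards."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    return min(end_c, start_c + rate_c_per_min * t_min)
```

A gradient of 110-185°C at 10°C per minute therefore takes 7.5 minutes, and under the assumed linear ramp the oven is at 140°C three minutes in.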

EXAMPLE 2: RNA ISOLATION 

For RNA extractions, middle leaves from 2 month old 
greenhouse grown plants were treated with ethylene as 
described. The 0 and 24-48 hour samples were used for 
RNA extraction. In some cases, leaf samples undergoing 
senescence were taken from the plants 10 days 
post flower-head removal. These samples were also used 
for extraction. Total RNA was isolated using RNeasy Plant 
Mini Kit® (Qiagen, Inc., Valencia, California) following 
manufacturer's protocol. 

The tissue sample was ground under liquid nitrogen 
to a fine powder using a DEPC treated mortar and pestle. 
Approximately 100 milligrams of ground tissue were 
transferred to a sterile 1.5 ml eppendorf tube. This 
sample tube was placed in liquid nitrogen until all 
samples were collected. Then, 450 µl of Buffer RLT as 
provided in the kit (with the addition of 
mercaptoethanol) was added to each individual tube. The 
sample was vortexed vigorously and incubated at 56°C for 
3 minutes. The lysate was then applied to the 
QIAshredder™ spin column sitting in a 2 ml collection 
tube, and centrifuged for 2 minutes at maximum speed. 
The flow through was collected and 0.5 volume of ethanol 
was added to the cleared lysate. The sample was mixed 
well and transferred to an RNeasy® mini spin column 
sitting in a 2 ml collection tube. The sample was 
centrifuged for 1 minute at 10,000 rpm. Next, 700 µl of 
buffer RW1 was pipetted onto the RNeasy® column and 
centrifuged for 1 minute at 10,000 rpm. Buffer RPE was 
pipetted onto the RNeasy® column in a new collection 
tube and centrifuged for 1 minute at 10,000 rpm. Buffer 
RPE was again added to the RNeasy® spin column and 
centrifuged for 2 minutes at maximum speed to dry the 
membrane. To eliminate any ethanol carry over, the 
membrane was placed in a separate collection tube and 
centrifuged for an additional 1 minute at maximum speed. 
The RNeasy® column was transferred into a new 1.5 ml 
collection tube, and 40 µl of RNase-free water was 
pipetted directly onto the RNeasy® membrane. This final 
elution tube was centrifuged for 1 minute at 10,000 rpm. 
Quality and quantity of total RNA were analyzed by 
denatured formaldehyde gel and spectrophotometer. 
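
Spectrophotometric quantification of RNA is commonly based on absorbance at 260 nm, using the widely cited convention that one A260 unit corresponds to about 40 µg/ml of RNA at a 1 cm path length. The Python sketch below applies that convention; it is an illustration under that assumption, not a statement of the exact calculation used in this Example, and the function names and absorbance readings are hypothetical.

```python
RNA_UG_PER_ML_PER_A260 = 40.0  # conventional factor for RNA, 1 cm path


def rna_concentration_ug_per_ml(a260, dilution_factor=1.0, path_cm=1.0):
    """Estimate RNA concentration (ug/ml) of the undiluted sample."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor / path_cm


def total_yield_ug(concentration_ug_per_ml, volume_ul):
    """Total RNA (ug) in a given volume expressed in microliters."""
    return concentration_ug_per_ml * volume_ul / 1000.0
```

For a hypothetical reading of A260 = 0.5 on a 10-fold dilution, the estimated concentration is 200 µg/ml, which corresponds to 8 µg of RNA in the 40 µl elution volume described above.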

Poly(A) RNA was isolated using Oligotex™ poly A+ RNA 
purification kit (Qiagen Inc.) following manufacturer's 
protocol. About 200 µg total RNA in 250 µl maximum 
volume was used. A volume of 250 µl of Buffer OBB and 15 
µl of Oligotex™ suspension was added to the 250 µl of 
total RNA. The contents were mixed thoroughly by 
pipetting and incubated for 3 minutes at 70°C on a 
heating block. The sample was then placed at room 
temperature for approximately 20 minutes. The 
Oligotex:mRNA complex was pelleted by centrifugation for 
2 minutes at maximum speed. All but 50 µl of the 
supernatant was removed from the microcentrifuge tube. 
The sample was treated further with Buffer OBB. The 
Oligotex:mRNA pellet was resuspended in 400 µl of Buffer 
OW2 by vortexing. This mix was transferred onto a small 
spin column placed in a new tube and centrifuged for 1 
minute at maximum speed. The spin column was transferred 
to a new tube and an additional 400 µl of Buffer OW2 was 
added to the column. The tube was then centrifuged for 
1 minute at maximum speed. The spin column was 
transferred to a final 1.5 ml microcentrifuge tube. The 
sample was eluted with 60 µl of hot (70°C) Buffer OEB. 
Poly A product was analyzed by denatured formaldehyde 
gels and spectrophotometric analysis. 

EXAMPLE 3: REVERSE TRANSCRIPTION-PCR 

First strand cDNA was produced using SuperScript 
reverse transcriptase following manufacturer's protocol 
(Invitrogen, Carlsbad, California). The poly A+ 
enriched RNA/oligo dT primer mix consisted of less than 
5 µg of total RNA, 1 µl of 10 mM dNTP mix, 1 µl of Oligo 
d(T)12-18 (0.5 µg/µl), and up to 10 µl of DEPC-treated 
water. Each sample was incubated at 65°C for 5 
minutes, then placed on ice for at least 1 minute. A 
reaction mixture was prepared by adding each of the 
following components in order: 2 µl 10X RT buffer, 4 
µl of 25 mM MgCl2, 2 µl of 0.1 M DTT, and 1 µl of 
RNaseOUT Recombinant RNase Inhibitor. An addition of 9 µl 
of reaction mixture was pipetted to each RNA/primer 
mixture and gently mixed. It was incubated at 42°C for 
2 minutes and 1 µl of SuperScript II™ RT was added to 
each tube. The tube was incubated for 50 minutes at 
42°C. The reaction was terminated at 70°C for 15 
minutes and chilled on ice. The sample was collected 
by centrifugation and 1 µl of RNase H was added to each 
tube and incubated for 20 minutes at 37°C. The second 
PCR was carried out with 200 pmoles of forward primer 
(degenerate primers as in Figure 75, SEQ. ID Nos. 
149-156) and 100 pmoles reverse primer (mix of 18nt oligo 
d(T) followed by 1 random base). 

Reaction conditions were 94°C for 2 minutes, followed 
by 40 cycles of PCR at 94°C for 1 minute, 
45° to 60°C for 2 minutes, and 72°C for 3 minutes, with a 
final 72°C extension for an extra 10 min. 
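
The thermal cycling conditions above can be written out as a small data structure to make the program and its nominal run time explicit. In the Python sketch below, the class and function names are assumptions, and the annealing temperature of 50°C is a hypothetical value chosen from within the stated 45° to 60°C range. The total excludes instrument ramp times between steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str
    temp_c: float
    minutes: float


def total_minutes(initial, cycle, n_cycles, final):
    """Nominal program time: hold times only, ramping between steps ignored."""
    per_cycle = sum(step.minutes for step in cycle)
    return initial.minutes + n_cycles * per_cycle + final.minutes


ANNEAL_C = 50.0  # hypothetical; the text gives a range of 45 to 60 deg C

INITIAL = Step("initial denaturation", 94.0, 2.0)
CYCLE = (
    Step("denaturation", 94.0, 1.0),
    Step("annealing", ANNEAL_C, 2.0),
    Step("extension", 72.0, 3.0),
)
FINAL = Step("final extension", 72.0, 10.0)
```

With 40 cycles of 6 minutes each, plus the 2 minute initial step and the 10 minute final extension, the nominal hold time is 252 minutes.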

Ten microliters of the amplified sample were 
analyzed by electrophoresis using a 1% agarose gel. The 
correct size fragments were purified from agarose gel. 



EXAMPLE 4: GENERATION OF PCR FRAGMENT POPULATIONS 

PCR fragments from Example 3 were ligated into a 
pGEM-T® Easy Vector (Promega, Madison, Wisconsin) 
following manufacturer's instructions. The ligated 
product was transformed into JM109 competent cells and 
plated on LB media plates for blue/white selection. 
Colonies were selected and grown in a 96 well plate 
with 1.2 ml of LB media overnight at 37°C. Frozen 
stock was generated for all selected colonies. Plasmid 
DNA from plates was purified using Beckman's Biomek 
2000 miniprep robotics with Wizard SV Miniprep® kit 
(Promega). Plasmid DNA was eluted with 100 µl water and 
stored in a 96 well plate. Plasmids were digested by 
EcoRI and were analyzed using 1% agarose gel to confirm 
the DNA quantity and size of inserts. The plasmids 
containing a 400-600 bp insert were sequenced using a 
CEQ 2000 sequencer (Beckman, Fullerton, California). 
The sequences were aligned with GenBank database by 
BLAST search. The p450 related fragments were 
identified and further analyzed. Alternatively, p450 
fragments were isolated from subtraction libraries. 
These fragments were also analyzed as described above. 

EXAMPLE 5: CONSTRUCTION OF CDNA LIBRARY 

A cDNA library was constructed by preparing total 
RNA from ethylene treated leaves as follows. First, 
total RNA was extracted from ethylene treated leaves of 
tobacco line 58-33 using a modified acid phenol and 
chloroform extraction protocol. The protocol was modified 
to use one gram of tissue that was ground and 
subsequently vortexed in 5 ml of extraction buffer (100 
mM Tris-HCl, pH 8.5; 200 mM NaCl; 10 mM EDTA; 0.5% SDS) 
to which 5 ml phenol (pH 5.5) and 5 ml chloroform were 
added. The extracted sample was centrifuged and the 
supernatant was saved. This extraction step was 
repeated 2-3 more times until the supernatant appeared 
clear. Approximately 5 ml of chloroform was added to 
remove trace amounts of phenol. RNA was precipitated 
from the combined supernatant fractions by adding a 
3-fold volume of EtOH and 1/10 volume of 3M NaOAc (pH 5.2) 
and storing at -20°C for 1 hour. After transferring to 
a Corex glass container the RNA fraction was 
centrifuged at 9,000 RPM for 45 minutes at 4°C. The 
pellet was washed with 70% ethanol and spun for 5 
minutes at 9,000 RPM at 4°C. After drying the pellet, 
the pelleted RNA was dissolved in 0.5 ml RNase free 
water. The quality and quantity of total RNA were 
analyzed by denatured formaldehyde gel and 
spectrophotometer, respectively. 

The resultant total RNA was used to isolate poly A+
RNA using an Oligo(dT) cellulose protocol (Invitrogen)
and Microcentrifuge spin columns (Invitrogen) by the
following protocol. Approximately twenty mg of total
RNA was subjected to two rounds of purification to
obtain high quality poly A+ RNA. The poly A+ RNA product
was analyzed by denaturing formaldehyde gel and subsequent
RT-PCR of known full-length genes to ensure high
quality of mRNA.

Next, poly A+ RNA was used as template to produce
a cDNA library employing the cDNA synthesis kit, ZAP-cDNA®
synthesis kit, and ZAP-cDNA® Gigapack® III gold
cloning kit (Stratagene, La Jolla, California). The
method followed the manufacturer's protocol as
specified. Approximately 8 µg of poly A+ RNA was used
to construct the cDNA library. Analysis of the primary
library revealed about 2.5 x 10^6 to 1 x 10^7 pfu. A
quality background test of the library was completed by
complementation assays using IPTG and X-gal, where
recombinant plaques were expressed at more than 100-fold
above the background reaction.

A more quantitative analysis of the library by
random PCR showed that the average size of insert cDNA
was approximately 1.2 kb. The method used a two-step PCR
method as follows. For the first step, reverse
primers were designed based on the preliminary sequence
information obtained from p450 fragments. The designed
reverse primers and T3 (forward) primers were used to
amplify corresponding genes from the cDNA library. PCR
reactions were subjected to agarose electrophoresis and
the corresponding bands of high molecular weight were
excised, purified, cloned and sequenced. In the second
step, new primers designed from the 5' UTR or the start
coding region of p450 as the forward primers together
with the reverse primers (designed from the 3' UTR of
p450) were used in the subsequent PCR to obtain
full-length p450 clones.

The p450 fragments were generated by PCR 
amplification from the constructed cDNA library as 
described in Example 3 with the exception of the 
reverse primer. The T7 primer located on the plasmid 
downstream of cDNA inserts (see Figure 75) was used as 
a reverse primer. PCR fragments were isolated, cloned 
and sequenced as described in Example 4. 

Full-length p450 genes were isolated by a PCR method
from the constructed cDNA library. Gene specific reverse
primers (designed from the downstream sequence of p450
fragments) and a forward primer (T3 on library plasmid)
were used to clone the full length genes. PCR
fragments were isolated, cloned and sequenced. If
necessary, a second PCR step was applied. In the second
step, new forward primers designed from the 5' UTR of
cloned p450s together with the reverse primers designed
from the 3' UTR of p450 clones were used in the subsequent
PCR reactions to obtain full-length p450 clones. The
clones were subsequently sequenced.
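
The primer arrangement used in the second PCR step can be illustrated with a short sketch. It assumes a hypothetical target sequence written 5' to 3' on the sense strand; the function names `reverse_complement` and `primer_pair`, the 20-base default length, and the example sequence are illustrative choices and are not taken from the protocol. The forward primer is copied from the 5' end of the target, and the reverse primer is the reverse complement of its 3' end.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_pair(target: str, length: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) primers flanking `target`.

    The forward primer is the first `length` bases of the sense strand;
    the reverse primer is the reverse complement of the last `length` bases.
    """
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    forward = target[:length]
    reverse = reverse_complement(target[-length:])
    return forward, reverse


# Hypothetical 12-base target, used only to show the orientation of each primer.
forward, reverse = primer_pair("ATGGCCTTAACG", length=4)
print(forward, reverse)
```

Real primer design would also weigh melting temperature, GC content and self-complementarity, none of which this sketch attempts.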

EXAMPLE 6: CHARACTERIZATION OF CLONED FRAGMENTS - 
REVERSE SOUTHERN BLOTTING ANALYSIS 

Nonradioactive large scale reverse Southern
blotting assays were performed on all p450 clones
identified in the above examples to detect differential
expression. It was observed that the level of
expression among different p450 clusters was very
different. Further real time detection was conducted
on those with high expression.

Nonradioactive Southern blotting procedures were 
conducted as follows. 

1) Total RNA was extracted from ethylene treated
and nontreated converter (58-33) and nonconverter
(58-25) leaves using the Qiagen RNeasy kit as described
in Example 2.

2) Probe was produced by biotin-tail labeling a
single strand cDNA derived from poly A+ enriched RNA
generated in the above step. This labeled single strand
cDNA was generated by RT-PCR of the converter and
nonconverter total RNA (Invitrogen) as described in
Example 3 with the exception of using biotinylated
oligo dT as a primer (Promega). These were used as a
probe to hybridize with cloned DNA.

3) Plasmid DNA was digested with restriction
enzyme EcoRI and run on agarose gels. Gels were
simultaneously dried and transferred to two nylon
membranes (Biodyne B®). One membrane was hybridized
with converter probe and the other with nonconverter
probe. Membranes were UV-crosslinked (auto crosslink
setting, 254 nm, Stratagene, Stratalinker) before
hybridization.

Alternatively, the inserts were PCR amplified from
each plasmid using the sequences located on both arms
of the pGEM plasmid, T3 and SP6, as primers. The PCR
products were analyzed by running on 96-well
Ready-to-run agarose gels. The confirmed inserts were
dotted on two nylon membranes. One membrane was hybridized
with converter probe and the other with nonconverter
probe.

4) The membranes were hybridized and washed
following the manufacturer's instructions, with the
modification of washing stringency (Enzo MaxSence™ kit,
Enzo Diagnostics, Inc, Farmingdale, NY). The membranes
were prehybridized with hybridization buffer (2x SSC
buffered formamide, containing detergent and
hybridization enhancers) at 42°C for 30 min and
hybridized with 10 µl denatured probe overnight at 42°C.
The membranes were then washed in 1X hybridization wash
buffer 1 time at room temperature for 10 min and 4
times at 68°C for 15 min. The membranes were then ready
for detection.

5) The washed membranes were detected by alkaline
phosphatase labeling followed by NBT/BCIP colorimetric
detection as described in the manufacturer's detection
procedure (Enzo Diagnostics, Inc.). The membranes were
blocked for one hour at room temperature with 1X
blocking solution, washed £ times with 1X detection
reagents for 10 min, washed 2 times with 1X
predevelopment reaction buffer for 5 min and then
developed in developing solution for 30-45
min until the dots appeared. All reagents were provided
by the manufacturer (Enzo Diagnostics, Inc). In addition,
a large scale reverse Southern assay was also performed
using the KPL Southern hybridization and detection kit™
following the manufacturer's instructions (KPL,
Gaithersburg, Maryland).

EXAMPLE 7: CHARACTERIZATION OF CLONES - NORTHERN BLOT
ANALYSIS

As an alternative to Southern blot analysis, some
membranes were hybridized and detected as described in
the example of Northern blotting assays. Northern
hybridization was used to detect mRNA differentially
expressed in Nicotiana as follows.

A random priming method was used to prepare probes 
from cloned p450 (Megaprime™ DNA Labelling Systems, 
Amersham Biosciences) . 

The following components were mixed: 25 ng
denatured DNA template; 4 µl of each unlabeled dTTP,
dGTP and dCTP; 5 µl of reaction buffer; 32P-labelled
dATP and 2 µl of Klenow I; and H2O, to bring the
reaction to 50 µl. The mixture was incubated at 37°C
for 1-4 hours, then stopped with 2 µl of 0.5 M EDTA.
The probe was denatured by incubating at 95°C for 5
minutes before use.

RNA samples were prepared from ethylene treated
and non-treated fresh leaves of several pairs of
tobacco lines. In some cases poly A+ enriched RNA was
used. Approximately 15 µg total RNA or 1.8 µg mRNA
(methods of RNA and mRNA extraction as described in
Example 5) were brought to equal volume with DEPC H2O
(5-10 µl). The same volume of loading buffer (1 x
MOPS; 18.5% formaldehyde; 50% formamide; 4%
Ficoll 400; bromophenol blue) and 0.5 µl EtBr (0.5 µg/µl)
were added. The samples were subsequently denatured in
preparation for separation of the RNA by
electrophoresis.

Samples were subjected to electrophoresis on a
formaldehyde gel (1% agarose, 1 x MOPS, 0.6 M
formaldehyde) with 1X MOPS buffer (0.4 M
morpholinopropanesulfonic acid; 0.1 M Na-acetate x 3
H2O; 10 mM EDTA; adjusted to pH 7.2 with NaOH). RNA was
transferred to a Hybond-N+ membrane (nylon, Amersham
Pharmacia Biotech) by the capillary method in 10X SSC
buffer (1.5 M NaCl; 0.15 M Na-citrate) for 24 hours.
Membranes with RNA samples were UV-crosslinked (auto
crosslink setting, 254 nm, Stratagene, Stratalinker)
before hybridization.

The membrane was prehybridized for 1-4 hours at
42°C with 5-10 ml prehybridization buffer (5 x SSC; 50%
formamide; 5 x Denhardt's solution; 1% SDS; 100 µg/ml
heat-denatured sheared non-homologous DNA). Old
prehybridization buffer was discarded, and new
prehybridization buffer and probe were added. The
hybridization was carried out overnight at 42°C. The
membrane was washed for 15 minutes with 2 x SSC at room
temperature, followed by a wash with 2 x SSC.

A major focus of the invention was the discovery
of novel genes that may be induced as a result of
ethylene treatment or play a key role in tobacco leaf
quality and constituents. As illustrated in the table
below, Northern blots and reverse Southern blots were
useful in determining which genes were induced by
ethylene treatment relative to non-induced plants.
Interestingly, not all fragments were affected
similarly in the converter and nonconverter. The
cytochrome p450 fragments of interest were partially
sequenced to determine their structural relatedness.
This information was used to subsequently isolate and
characterize full length gene clones of interest.



Fragments                     Induced mRNA Expression,
                              Ethylene Treatment, Converter

D56-AC7 (SEQ ID No: 35)       +
D56-AG11 (SEQ ID No: 31)
D56-AC12 (SEQ ID No: 45)      +
D70A-AB5 (SEQ ID No: 95)      +
D73-AC9 (SEQ ID No: 43)       +
D70A-AA12 (SEQ ID No: 131)    +
D73A-AG3 (SEQ ID No: 129)     +
D34-52 (SEQ ID No: 61)        +
D56-AG6 (SEQ ID No: 51)       +

Northern analysis was performed using full length
clones on tobacco tissue obtained from converter and
nonconverter burley lines that were induced by ethylene
treatment. The purpose was to identify those full
length clones that showed elevated expression in
ethylene induced converter lines relative to ethylene
induced nonconverter burley lines. By so doing, the
functional relationship of full length clones may be
determined by comparing biochemical differences in leaf
constituents between converter and nonconverter lines.
As shown in the table below, six clones showed
significantly higher expression, as denoted by ++ and
+++, in converter ethylene treated tissue than in
nonconverter treated tissue, denoted by +. All of
these clones showed little or no expression in
converter and nonconverter lines that were not ethylene
treated.



Full Length Clones    Converter    Nonconverter

D101-BA2              ++           +
D207-AA5              ++           +
D208-AC8              +++          +
D237-AD1              ++           +
D89-AB1               ++           +
D90A-BB3              ++           +

EXAMPLE 8: IMMUNODETECTION OF P450S ENCODED BY THE CLONED
GENES

Peptide regions 20-22 amino acids in
length from three p450 clones were selected for 1) having
lower or no homology to other clones and 2) having good
hydrophilicity and antigenicity. The amino acid sequences
of the peptide regions selected from the respective p450
clones are listed below. The synthesized peptides were
conjugated with KLH and then injected into rabbits.
Antisera were collected 2 and 4 weeks after the 4th
injection (Alpha Diagnostic Intl., Inc. San Antonio, TX).

D234-AD1    DIDGSKSKLVKAHRKIDEILG
D90a-BB3    RDAFREKETFDENDVEELNY
D89-AB1     FKNNGDEDRHFSQKLGDLADKY
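
As a rough check on the stated selection criteria, the length and average hydropathy of each listed peptide can be computed. The sketch below uses the Kyte-Doolittle hydropathy scale, which is an assumption made here for illustration; the text does not say which hydrophilicity or antigenicity measure was actually used. On this scale a negative mean value suggests a hydrophilic peptide. The names `KYTE_DOOLITTLE`, `PEPTIDES` and `mean_hydropathy` are likewise illustrative.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the text).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Peptide sequences as listed above, with extraction spaces removed.
PEPTIDES = {
    "D234-AD1": "DIDGSKSKLVKAHRKIDEILG",
    "D90a-BB3": "RDAFREKETFDENDVEELNY",
    "D89-AB1": "FKNNGDEDRHFSQKLGDLADKY",
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle value over all residues of the peptide."""
    return sum(KYTE_DOOLITTLE[residue] for residue in peptide) / len(peptide)


for clone, peptide in PEPTIDES.items():
    print(f"{clone}: {len(peptide)} residues, "
          f"mean hydropathy {mean_hydropathy(peptide):.2f}")
```

A mean over the whole peptide is a coarse summary; sliding-window profiles or dedicated antigenicity predictors would give a finer picture.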

Antisera were examined for crossreactivity to target
proteins from tobacco plant tissue by Western blot analysis.
Crude protein extracts were obtained from ethylene treated
(0 to 40 hours) middle leaves of converter and nonconverter
lines. Protein concentrations of the extracts were
determined using the RC DC Protein Assay Kit (BIO-RAD)
following the manufacturer's protocol.

Two micrograms of protein were loaded onto each lane
and the proteins separated on 10% - 20% gradient gels using
the Laemmli SDS-PAGE system. The proteins were transferred
from gels to PROTRAN® Nitrocellulose Transfer Membranes
(Schleicher & Schuell) with the Trans-Blot® Semi-Dry cell
(BIO-RAD). Target p450 proteins were detected and
visualized with the ECL Advance™ Western Blotting Detection
Kit (Amersham Biosciences). Primary antibodies against the
synthetic-KLH conjugates were made in rabbits. Secondary
antibody against rabbit IgG, coupled with peroxidase, was
purchased from Sigma. Both primary and secondary antibodies
were used at 1:1000 dilutions. Antibodies showed strong
reactivity to a single band on the Western blots, indicating
that the antisera were monospecific to the target peptide of
interest. Antisera were also crossreactive with synthetic
peptides conjugated to KLH.

EXAMPLE 9: NUCLEIC ACID IDENTITY AND STRUCTURE RELATEDNESS
OF ISOLATED NUCLEIC ACID FRAGMENTS

Over 100 cloned p450 fragments were sequenced in
conjunction with Northern blot analysis to determine their
structural relatedness. The approach utilized forward
primers based on either of two common p450 motifs located
near the carboxyl-terminus of the p450 genes. The forward
primers corresponded to cytochrome p450 motifs FXPERF or
GRRXCP(A/G) as denoted in Figure 1. The reverse primers
were standard primers from either the plasmid, SP6 or T7
located on both arms of the pGEM™ plasmid, or a poly A tail.
The protocol used is described below.

Spectrophotometry was used to estimate the
concentration of starting double stranded DNA following the
manufacturer's protocol (Beckman Coulter). The template was
diluted with water to the appropriate concentration,
denatured by heating at 95°C for 2 minutes, and
subsequently placed on ice. The sequencing reaction was
prepared on ice using 0.5 to 10 µl of denatured DNA template,
2 µl of 1.6 pmole of the forward primer, 8 µl of DTCS Quick
Start Master Mix and the total volume brought to 20 µl with
water. The thermocycling program consisted of 30 cycles of
the following cycle: 96°C for 20 seconds, 50°C for 20
seconds, and 60°C for 4 minutes, followed by holding at
4°C.
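
The cycling program above implies a minimum run time that can be worked out directly: each cycle holds for 20 + 20 + 240 seconds, so 30 cycles come to 140 minutes. The short sketch below reproduces that arithmetic; it ignores ramp times between temperatures and the final 4°C hold, and the names `CYCLE_STEPS` and `program_minutes` are illustrative.

```python
# Sequencing program from the text: (temperature in degrees C, hold in seconds).
CYCLE_STEPS = [(96, 20), (50, 20), (60, 4 * 60)]
CYCLES = 30


def program_minutes(steps, cycles):
    """Total programmed hold time in minutes (ramp times not included)."""
    seconds_per_cycle = sum(seconds for _, seconds in steps)
    return cycles * seconds_per_cycle / 60


print(f"{program_minutes(CYCLE_STEPS, CYCLES):.0f} minutes of programmed hold time")
```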

The sequencing reaction was stopped by adding 5 µl of stop
buffer (equal volumes of 3 M NaOAc and 100 mM EDTA, and 1 µl
of 20 mg/ml glycogen). The sample was precipitated with 60 µl
of cold 95% ethanol and centrifuged at 6000 g for 6 minutes.
The ethanol was discarded. The pellet was washed twice with
200 µl of cold 70% ethanol. After the pellet was dry, 40 µl of
SLS solution was added and the pellet was resuspended. A layer
of mineral oil was overlaid. The sample was then placed
on the CEQ 8000 Automated Sequencer for further analysis.

In order to verify nucleic acid sequences, each nucleic acid
sequence was re-sequenced in both directions using forward
primers to the FXPERF or GRRXCP(A/G) region of the p450 gene
or reverse primers to either the plasmid or poly A tail. All
sequencing was performed at least twice in both directions.

The nucleic acid sequences of cytochrome p450 fragments
were compared to each other from the coding region
corresponding to the first nucleic acid after the region
encoding the GRRXCP(A/G) motif through to the stop codon.
This region was selected as an indicator of genetic
diversity among p450 proteins. A large number of
genetically distinct p450 genes, in excess of 70 genes, were
observed, similar to that of other plant species. Upon
comparison of nucleic acid sequences, it was found that the
genes could be placed into distinct sequence groups based
on their sequence identity. The best
unique grouping of p450 members was determined to be those
sequences with 75% nucleic acid identity or greater (shown
in Table I). Reducing the percentage identity resulted in
significantly larger groups. A preferred grouping was
observed for those sequences with 81% nucleic acid identity
or greater, a more preferred grouping 91% nucleic acid
identity or greater, and a most preferred grouping for those
sequences with 99% nucleic acid identity or greater. Most of
the groups contained at least two members and frequently three
or more members. Others were not repeatedly discovered,
suggesting that the approach taken was able to isolate both
low and high expressing mRNA in the tissue used.
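
The effect of the identity threshold on group size can be sketched in code. The example below is a minimal illustration under stated assumptions, not the procedure used to build Table I: it assumes the sequences have already been aligned to equal length (gaps written as `-`), it defines percent identity over ungapped columns only, and it joins sequences by single linkage, so that a sequence enters a group when it meets the threshold with at least one existing member. The fragment names and sequences are made up.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    return 100.0 * sum(x == y for x, y in columns) / len(columns)


def group_by_identity(seqs: dict[str, str], threshold: float) -> list[set[str]]:
    """Single-linkage grouping: link any pair at or above `threshold` percent."""
    parent = {name: name for name in seqs}

    def find(name: str) -> str:
        # Walk to the root of the group, shortening the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical aligned fragments, for illustration only.
toy = {
    "frag_a": "ACGTACGTAC",
    "frag_b": "ACGTACGTAA",  # 90% identical to frag_a
    "frag_c": "TTTTTTTTTT",
}
print(group_by_identity(toy, threshold=75.0))
print(group_by_identity(toy, threshold=95.0))
```

Lowering the threshold can only merge groups under this scheme, which is consistent with the observation that reducing the percentage identity produced larger groups.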

Based on 75% nucleic acid identity or greater, two
cytochrome p450 groups were found to contain nucleic acid
sequence identity to previously reported tobacco cytochrome
genes that are genetically distinct from those within the
group. Group 23 showed nucleic acid identity, within the
parameters used for Table I, to prior GenBank sequences of
GI: 1171579 (CAA64635) and GI: 14423327 (or AAK62346) by
Czernic et al. and Ralston et al., respectively. GI: 1171579
had nucleic acid identity to Group 23 members ranging from
96.9% to 99.5%, while GI: 14423327 ranged from
95.4% to 96.9% identity to this group. The members of Group
31 had nucleic acid identity ranging from 76.7% to 97.8%
to the GenBank reported sequence of GI: 14423319
(AAK62342) by Ralston et al. None of the other p450
identity groups of Table I contained parameter identity, as
used in Table I, to Nicotiana p450 genes reported by
Ralston et al., Czernic et al., Wang et al. or LaRosa and Smigocki.

As shown in Figure 76, a consensus sequence with
appropriate nucleic acid degenerate probes could be derived
for each group to preferentially identify and isolate
additional members of each group from Nicotiana plants.
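
One simple way to derive a degenerate consensus from the members of a group is to collapse each aligned column into the IUPAC nucleotide code that covers every base seen in that column. The sketch below assumes the member sequences are already aligned, gap-free and of equal length; the function name `degenerate_consensus` and the input sequences are illustrative and are not taken from Figure 76.

```python
# IUPAC nucleotide codes keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_consensus(aligned: list[str]) -> str:
    """Collapse equal-length, gap-free aligned DNA sequences column by column."""
    if not aligned or len({len(seq) for seq in aligned}) != 1:
        raise ValueError("need one or more sequences of equal length")
    consensus = []
    for column in zip(*aligned):
        bases = frozenset(base.upper() for base in column)
        if bases not in IUPAC:
            raise ValueError(f"unsupported characters in column: {sorted(bases)}")
        consensus.append(IUPAC[bases])
    return "".join(consensus)


# Hypothetical aligned group members.
print(degenerate_consensus(["ACGT", "ACGA", "ATGA"]))
```

A probe designed this way trades specificity for coverage: the more degenerate positions it carries, the larger the pool of oligonucleotides it represents.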

Table I: Nicotiana p450 Nucleic Acid Sequence Identity
Groups

GROUP FRAGMENTS

1 D58-BG7 (SEQ ID No.: 1), D58-AB1 (SEQ ID No.: 3); D58-BE4
(SEQ ID No.: 7)

2 D56-AH7 (SEQ ID No.: 9); D13a-5 (SEQ ID No.: 11)

3 D56-AG10 (SEQ ID No.: 13); D35-33 (SEQ ID No.: 15);
D34-62 (SEQ ID No.: 17)

4 D56-AA7 (SEQ ID No.: 19); D56-AE1 (SEQ ID No.: 21);
185-BD3 (SEQ ID No.: 143)

5 D35-BB7 (SEQ ID No.: 23); D177-BA7 (SEQ ID No.: 25);
D56A-AB6 (SEQ ID No.: 27); D144-AE2 (SEQ ID No.: 29)

6 D56-AG11 (SEQ ID No.: 31); D179-AA1 (SEQ ID No.: 33)

7 D56-AC7 (SEQ ID No.: 35); D144-AD1 (SEQ ID No.: 37)

8 D144-AB5 (SEQ ID No.: 39)

9 D181-AB5 (SEQ ID No.: 41); D73-AC9 (SEQ ID No.: 43)

10 D56-AC12 (SEQ ID No.: 45)

11 D58-AB9 (SEQ ID No.: 47); D56-AG9 (SEQ ID No.: 49);
D56-AG6 (SEQ ID No.: 51); D35-BG11 (SEQ ID No.: 53); D35-42
(SEQ ID No.: 55); D35-BA3 (SEQ ID No.: 57); D34-57 (SEQ ID
No.: 59); D34-52 (SEQ ID No.: 61); D34-25 (SEQ ID No.: 63)

12 D56-AD10 (SEQ ID No.: 65)

13 56-AA11 (SEQ ID No.: 67)

14 D177-BD5 (SEQ ID No.: 69); D177-BD7 (SEQ ID No.: 83)

15 D56A-AG10 (SEQ ID No.: 71); D58-BC5 (SEQ ID No.: 73);
D58-AD12 (SEQ ID No.: 75)

16 D56-AC11 (SEQ ID No.: 77); D35-39 (SEQ ID No.: 79);
D58-BH4 (SEQ ID No.: 81); D56-AD6 (SEQ ID No.: 87)

17 D73A-AD6 (SEQ ID No.: 89); D70A-BA11 (SEQ ID No.: 91)

18 D70A-AB5 (SEQ ID No.: 95); D70A-AA8 (SEQ ID No.: 97)

19 D70A-AB8 (SEQ ID No.: 99); D70A-BH2 (SEQ ID No.: 101);
D70A-AA4 (SEQ ID No.: 103)

20 D70A-BA1 (SEQ ID No.: 105); D70A-BA9 (SEQ ID No.: 107)

21 D70A-BD4 (SEQ ID No.: 109)

22 D181-AC5 (SEQ ID No.: 111); D144-AH1 (SEQ ID No.: 113);
D34-65 (SEQ ID No.: 115)

23 D35-BG2 (SEQ ID No.: 117)

24 D73A-AH7 (SEQ ID No.: 119)

25 D58-AA1 (SEQ ID No.: 121); D185-BC1 (SEQ ID No.: 133);
D185-BG2 (SEQ ID No.: 135)

26 D73-AE10 (SEQ ID No.: 123)

27 D56-AC12 (SEQ ID No.: 125)

28 D177-BF7 (SEQ ID No.: 127); D185-BE1 (SEQ ID No.: 137);
D185-BD2 (SEQ ID No.: 139)

29 D73A-AG3 (SEQ ID No.: 129)

30 D70A-AA12 (SEQ ID No.: 131); D176-BF2 (SEQ ID No.: 85)

31 D176-BC3 (SEQ ID No.: 145)

32 D176-BB3 (SEQ ID No.: 147)

D186-AH4 (SEQ ID No.: 5)

EXAMPLE 10: RELATED AMINO ACID SEQUENCE IDENTITY OF 
ISOLATED NUCLEIC ACID FRAGMENTS 

The amino acid sequences of nucleic acid sequences
obtained for cytochrome p450 fragments from Example 8
were deduced. The deduced region corresponded to the
amino acid immediately after the GXRXCP(A/G) sequence
motif to the end of the carboxyl-terminus, or stop
codon. Upon comparison of sequence identity of the
fragments, a unique grouping was observed for those
sequences with 70% amino acid identity or greater. A
preferred grouping was observed for those sequences
with 80% amino acid identity or greater, more preferred
with 90% amino acid identity or greater, and a most
preferred grouping for those sequences with 99% amino
acid identity or greater. The groups and corresponding
amino acid sequences of group members are shown in
Figure 2. Several of the unique nucleic acid sequences
were found to have complete amino acid identity to
other fragments and therefore only one member with the
identical amino acid sequence was reported.

The amino acid identity for Group 19 of Table II
corresponded to three distinct groups based on their
nucleic acid sequences. The amino acid sequences of
each group member and their identity are shown in
Figure 77. The amino acid differences are
appropriately marked.

At least one member of each amino acid identity
group was selected for gene cloning and functional
studies using plants. In addition, group members that
are differentially affected by ethylene treatment or
other biological differences as assessed by Northern
and Southern analysis were selected for gene cloning
and functional studies. To assist in gene cloning,
expression studies and whole plant evaluations, peptide
specific antibodies will be prepared based on sequence
identity and differential sequence.

GROUP FRAGMENTS

1 D58-BG7 (SEQ ID No.: 2), D58-AB1 (SEQ ID No.: 4)

2 D58-BE4 (SEQ ID No.: 8)

3 D56-AH7 (SEQ ID No.: 10); D13a-5 (SEQ ID No.: 12)

4 D56-AG10 (SEQ ID No.: 14); D34-62 (SEQ ID No.: 18)

5 D56-AA7 (SEQ ID No.: 20); D56-AE1 (SEQ ID No.: 22);
185-BD3 (SEQ ID No.: 144)

6 D35-BB7 (SEQ ID No.: 24); D177-BA7 (SEQ ID No.: 26);
D56A-AB6 (SEQ ID No.: 28); D144-AE2 (SEQ ID No.: 30)

7 D56-AG11 (SEQ ID No.: 32); D179-AA1 (SEQ ID No.: 34)

8 D56-AC7 (SEQ ID No.: 36); D144-AD1 (SEQ ID No.: 38)

9 D144-AB5 (SEQ ID No.: 40)

10 D181-AB5 (SEQ ID No.: 42); D73-AC9 (SEQ ID No.: 44)

11 D56-AC12 (SEQ ID No.: 46)

12 D58-AB9 (SEQ ID No.: 48); D56-AG9 (SEQ ID No.: 50);
D56-AG6 (SEQ ID No.: 52); D35-BG11 (SEQ ID No.: 54); D35-42
(SEQ ID No.: 56); D35-BA3 (SEQ ID No.: 58); D34-57 (SEQ ID
No.: 60); D34-52 (SEQ ID No.: 62)

13 D56-AD10 (SEQ ID No.: 66)

14 56-AA11 (SEQ ID No.: 68)

15 D177-BD5 (SEQ ID No.: 70); D177-BD7 (SEQ ID No.: 84)

16 D56A-AG10 (SEQ ID No.: 72); D58-BC5 (SEQ ID No.: 74);
D58-AD12 (SEQ ID No.: 76)

17 D56-AC11 (SEQ ID No.: 78); D56-AD6 (SEQ ID No.: 88)

18 D73A-AD6 (SEQ ID No.: 90)

19 D70A-AB5 (SEQ ID No.: 96); D70A-AB8 (SEQ ID No.: 100);
D70A-BH2 (SEQ ID No.: 102); D70A-AA4 (SEQ ID No.: 104);
D70A-BA1 (SEQ ID No.: 106); D70A-BA9 (SEQ ID No.: 108)

20 D70A-BD4 (SEQ ID No.: 110)

21 D181-AC5 (SEQ ID No.: 112); D144-AH1 (SEQ ID No.: 114);
D34-65 (SEQ ID No.: 116)

22 D35-BG2 (SEQ ID No.: 118)

23 D73A-AH7 (SEQ ID No.: 120)

24 D58-AA1 (SEQ ID No.: 122); D185-BC1 (SEQ ID No.: 134);
D185-BG2 (SEQ ID No.: 136)

25 D73-AE10 (SEQ ID No.: 124)

26 D56-AC12 (SEQ ID No.: 126)

27 D177-BF7 (SEQ ID No.: 128); 185-BD2 (SEQ ID No.: 140)

28 D73A-AG3 (SEQ ID No.: 130)

29 D70A-AA12 (SEQ ID No.: 132); D176-BF2 (SEQ ID No.: 86)

30 D176-BC3 (SEQ ID No.: 146)

31 D176-BB3 (SEQ ID No.: 148)

32 D186-AH4 (SEQ ID No.: 6)

EXAMPLE 11: RELATED AMINO ACID SEQUENCE IDENTITY OF FULL
LENGTH CLONES

The nucleic acid sequences of full length Nicotiana
genes cloned in Example 5 were used to deduce their entire
amino acid sequences. Cytochrome p450 genes were identified
by the presence of three conserved p450 domain motifs, which
corresponded to UXXRXXZ, PXRFXF or GXRXC at the
carboxyl-terminus, where U is E or K, X is any amino acid and
Z is P, T, S or M. It was also noted that two of the clones,
D130-AA1 and D101-BA2, appeared nearly complete but lacked
the appropriate stop codon; however, both contained all
three p450 cytochrome domains. All p450 genes were
characterized for amino acid identity using a BLAST program
comparing their full length sequences to each other and to
known tobacco genes. The program used the NCBI special
BLAST tool (Align two sequences (bl2seq),
http://www.ncbi.nlm.nih.gov/blast/bl2seq/bl2.html). Two
sequences were aligned under BLASTN without filter for
nucleic acid sequences and BLASTP for amino acid sequences.
Based on their percentage amino acid identity, each sequence
was grouped into identity groups where the grouping
contained members that shared at least 85% identity with
another member. A preferred grouping was observed for those
sequences with 90% amino acid identity or greater, a more
preferred grouping had 95% amino acid identity or greater,
and a most preferred grouping had those sequences with 99%
amino acid identity or greater. Using these criteria, 25 unique
groups were identified and are depicted in Table III.
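
The three domain motifs translate directly into regular expressions, which makes the screening step easy to illustrate. In the sketch below, U is written as `[EK]`, Z as `[PTSM]` and X as `.` (any residue). The code searches the whole deduced protein rather than only its carboxyl-terminal portion, which is a simplification; the names `P450_MOTIFS`, `find_p450_motifs` and `has_all_motifs` and the test sequence are made up for illustration.

```python
import re

# Motif patterns: U = E or K, X = any residue, Z = P, T, S or M.
P450_MOTIFS = {
    "UXXRXXZ": re.compile(r"[EK]..R..[PTSM]"),
    "PXRFXF": re.compile(r"P.RF.F"),
    "GXRXC": re.compile(r"G.R.C"),
}


def find_p450_motifs(protein: str) -> dict:
    """Map each motif name to the 0-based start of its first match, or None."""
    hits = {}
    for name, pattern in P450_MOTIFS.items():
        match = pattern.search(protein.upper())
        hits[name] = match.start() if match else None
    return hits


def has_all_motifs(protein: str) -> bool:
    """True when all three motifs occur somewhere in the sequence."""
    return all(pos is not None for pos in find_p450_motifs(protein).values())


# Made-up sequence containing one copy of each motif; not a real p450.
toy_protein = "MAAAEAARLLPGGGGPERFAFGGGRRACPG"
print(find_p450_motifs(toy_protein))
print(has_all_motifs(toy_protein))
```

A stricter screen could also require the motifs to appear in the expected order and within a set distance of the carboxyl-terminus.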

Within the parameters used for Table III for amino acid
identity, three groups were found to contain
85% or greater identity to known tobacco genes. Members of
Group 5 had up to 96% amino acid identity for full length
sequences to prior GenBank sequences of GI: 14423327 (or
AAK62346) by Ralston et al. Group 23 had up to 93% amino
acid identity to GI: 14423328 (or AAK62347) by Ralston et al.
and Group 24 had 92% identity to GI: 14423318 (or AAK62343)
by Ralston et al.

Table III: Amino Acid Sequence Identity Groups of Full 
Length Nicotiana p450 Genes 

1 D208-AD9 (SEQ. ID. No. 224); D120-AH4 (SEQ. ID. No. 
180); D121-AA8 (SEQ. ID. No. 182), D122-AF10 (SEQ. ID. 
No. 184); D103-AH3 (SEQ. ID. No. 222); D208-AC8 (SEQ. 
ID. No. 218); D-235-ABI (SEQ. ID. No. 246) 

2 D244-AD4 (SEQ. ID. No. 250); D244-AB6 (SEQ. ID. No. 
274); D285-AA8; D285-AB9; D268-AE2 (SEQ. ID. No. 270)

3 D100A-AC3 (SEQ. ID. No. 168); D100A-BE2 

4 D205-BE9 (SEQ. ID. No. 276); D205-BG9 (SEQ. ID. No. 
202); D205-AH4 (SEQ. ID. No. 294) 

5 D259-AB9 (SEQ. ID. No. 260); D257-AE4 (SEQ. ID. No.
268); D147-AD3 (SEQ. ID. No. 194) 

6 D249-AE8 (SEQ. ID. No. 256); D-248-AA6 (SEQ. ID. No. 
254) 

7 D233-AG7 (SEQ. ID. No. 266); D224-BD11 (SEQ. ID. No.
240); DAF10

8 D105-AD6 (SEQ. ID. No. 172); D215-AB5 (SEQ. ID. No. 
220); D135-AE1 (SEQ. ID. No. 190) 

9 D87A-AF3 (SEQ. ID. No. 216), D210-BD4 (SEQ. ID. No.
262) 

10 D89-AB1 (SEQ. ID. No. 150); D89-AD2 (SEQ. ID. No. 152); 
163-AG11 (SEQ. ID. No. 198); 163-AF12 (SEQ. ID. No. 
196) 

11 D267-AF10 (SEQ. ID. No. 296); D96-AC2 (SEQ. ID. No. 
160); D96-AB6 (SEQ. ID. No. 158); D207-AA5 (SEQ. ID. 
No. 204); D207-AB4 (SEQ. ID. No. 206); D207-AC4 (SEQ. 
ID. No. 208) 

12 D98-AG1 (SEQ. ID. No. 164); D98-AA1 (SEQ. ID. No. 162) 

13 D209-AA12 (SEQ. ID. No. 212); D209-AA11; D209-AH10 
(SEQ. ID. No. 214); D209-AH12 (SEQ. ID. No. 232); 
D90a-BB3 (SEQ. ID. No. 154) 

14 D129-AD10 (SEQ. ID. No. 188); D104A-AE8 (SEQ. ID. No. 
170) 

15 D228-AH8 (SEQ. ID. No. 244); D228-AD7 (SEQ. ID. No. 
241), D250-AC11 (SEQ. ID. No. 258); D247-AH1 (SEQ.
ID. No. 252) 

16 D128-AB7 (SEQ. ID. No. 186); D243-AA2 (SEQ. ID. No.
248); D125-AF11 (SEQ. ID. No. 228) 

17 D284-AH5 (SEQ. ID. No. 298); D110-AF12 (SEQ. ID. No. 
176) 

18 D221-BB8 (SEQ. ID. No. 234) 

19 D222-BH4 (SEQ. ID. No. 236) 

20 D134-AE11 (SEQ. ID. No. 230) 

21 D109-AH8 (SEQ. ID. No. 174) 

22 D136-AF4 (SEQ. ID. No. 278)

23 D237-AD1 (SEQ. ID. No. 226) 

24 D112-AA5 (SEQ. ID. No. 178) 

25 D283-AC1 (SEQ. ID. No. 272) 

The full length genes were further grouped based on the
highly conserved amino acid homology between the UXXRXXZ p450
domain and the GXRXC p450 domain near the end of the
carboxyl-terminus. As shown in Figure 3, individual clones were
aligned for their sequence homology between the conserved
domains relative to each other and placed in distinct
identity groups. In several cases, although the nucleic
acid sequence of the clone was unique, the amino acid
sequence for the region was identical. The preferred
grouping was observed for those sequences with 90% amino
acid identity or greater, a more preferred group had 95%
amino acid identity or greater, and a most preferred
grouping had those sequences with 99% amino acid identity or
greater. The final grouping was similar to that based on
the percent identity for the entire amino acid sequence of
the clones except for Group 17 (of Table III) which was
divided into two distinct groups.
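
The region compared in this second grouping can be extracted with the same motif patterns. The sketch below assumes the region runs from the residue after the first UXXRXXZ match to the residue before the next GXRXC match, excluding both motifs; the text does not state whether the motifs themselves were included, so that boundary choice is an assumption. The function name `region_between_domains` and the test sequence are illustrative.

```python
import re

START_DOMAIN = re.compile(r"[EK]..R..[PTSM]")  # UXXRXXZ
END_DOMAIN = re.compile(r"G.R.C")              # GXRXC


def region_between_domains(protein: str):
    """Return the residues between the two domains, or None if either is absent."""
    start = START_DOMAIN.search(protein)
    if start is None:
        return None
    # Look for the closing domain only downstream of the opening one.
    end = END_DOMAIN.search(protein, start.end())
    if end is None:
        return None
    return protein[start.end():end.start()]


# Made-up sequence; not a real p450.
print(region_between_domains("MAAAEAARLLPGGGGPERFAFGGGRRACPG"))
```

The extracted regions from two clones could then be aligned and scored for identity in the same way as the full length sequences.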

Within the parameters used for amino acid identity in
Table IV, three groups were found to contain 90% or greater
identity to known tobacco genes. Members of Group 5 had up
to 93.4% amino acid identity for full length sequences to
prior GenBank sequences of GI: 14423326 (AAK62346) by Ralston
et al. Group 23 had up to 91.8% amino acid identity to
GI: 14423328 (or AAK62347) by Ralston et al. and Group 24 had
98.8% identity to GI: 14423318 (or AAK62342) by Ralston et
al.

Table IV: Amino Acid Sequence Identity Groups of Regions
between Conserved Domains of Nicotiana p450 Genes

1 D208-AD9 (SEQ. ID. No. 224); D120-AH4 (SEQ. ID. No.
180); D121-AA8 (SEQ. ID. No. 182), D122-AF10 (SEQ. ID.
No. 184); D103-AH3 (SEQ. ID. No. 222); D208-AC8 (SEQ. 
ID. No. 218); D-235-ABI (SEQ. ID. No. 246) 

2 D244-AD4 (SEQ. ID. No. 250); D244-AB6 (SEQ. ID. No. 
274); D285-AA8; D285-AB9; D268-AE2 (SEQ. ID. No. 270)

3 D100A-AC3 (SEQ. ID. No. 168); D100A-BE2 

4 D205-BE9 (SEQ. ID. No. 276); D205-BG9 (SEQ. ID. No. 
202); D205-AH4 (SEQ. ID. No. 294) 

5 D259-AB9 (SEQ. ID. No. 260); D257-AE4 (SEQ. ID. No.
268); D147-AD3 (SEQ. ID. No. 194) 

6 D249-AE8 (SEQ. ID. No. 256); D-248-AA6 (SEQ. ID. No. 
254) 

7 D233-AG7 (SEQ. ID. No. 266); D224-BD11 (SEQ. ID. No.
240); DAF10

8 D105-AD6 (SEQ. ID. No. 172); D215-AB5 (SEQ. ID. No. 
220); D135-AE1 (SEQ. ID. No. 190) 

9 D87A-AF3 (SEQ. ID. No. 216), D210-BD4 (SEQ. ID. No.
262)

10 D89-AB1 (SEQ. ID. No. 150); D89-AD2 (SEQ. ID. No. 152); 
163-AG11 (SEQ. ID. No. 198); 163-AF12 (SEQ. ID. No. 
196) 

11 D267-AF10 (SEQ. ID. No. 296); D96-AC2 (SEQ. ID. No. 
160); D96-AB6 (SEQ. ID. No. 158); D207-AA5 (SEQ. ID.
No. 204); D207-AB4 (SEQ. ID. No. 206); D207-AC4 (SEQ. 
ID. No. 208) 

12 D98-AG1 (SEQ. ID. No. 164); D98-AA1 (SEQ. ID. No. 162) 

13 D209-AA12 (SEQ. ID. No. 212); D209-AA11; D209-AH10 
(SEQ. ID. No. 214); D209-AH12 (SEQ. ID. No. 232); 
D90a-BB3 (SEQ. ID. No. 154) 

14 D129-AD10 (SEQ. ID. No. 188); D104A-AE8 (SEQ. ID. No.
170) 

15 D228-AH8 (SEQ. ID. No. 244); D228-AD7 (SEQ. ID. No. 
241), D250-AC11 (SEQ. ID. No. 258); D247-AH1 (SEQ. 
ID. No. 252) 

16 D128-AB7 (SEQ. ID. No. 186); D243-AA2 (SEQ. ID. No.
248); D125-AF11 (SEQ. ID. No. 228) 

17 D284-AH5 (SEQ. ID. No. 298); D110-AF12 (SEQ. ID. No. 
176) 

18 D221-BB8 (SEQ. ID. No. 234) 

19 D222-BH4 (SEQ. ID. No. 236) 

20 D134-AE11 (SEQ. ID. No. 230) 

21 D109-AH8 (SEQ. ID. No. 174) 

22 D136-AF4 (SEQ. ID. No. 278) 

23 D237-AD1 (SEQ. ID. No. 226) 

24 D112-AA5 (SEQ. ID. No. 178) 

25 D283-AC1 (SEQ. ID. No. 272) 

26 D110-AF12 (SEQ. ID. No. 176) 
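
The grouping idea behind Table IV can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and uses a simple greedy grouping rule; the 85% cut-off mirrors the threshold used in the claims, while the function names, the toy peptides, and the grouping rule itself are illustrative assumptions and not the procedure actually used to build the table.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    # Gap characters ('-') never count as a match.
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


def group_by_identity(sequences, threshold=85.0):
    """Greedy grouping: a clone joins the first group that already holds
    a member with at least `threshold` percent identity to it."""
    groups = []
    for name, seq in sequences.items():
        for group in groups:
            if any(percent_identity(seq, sequences[m]) >= threshold
                   for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical toy peptides, not sequences from this document.
toy = {
    "clone_a": "MKTAYIAKQR",
    "clone_b": "MKTAYIAKQK",
    "clone_c": "GGGGGGGGGG",
}
print(group_by_identity(toy))
```

A real analysis would first align the sequences with a dedicated alignment tool; the sketch only shows how a percent-identity threshold turns pairwise comparisons into groups.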



EXAMPLE 12: NICOTIANA CYTOCHROME P450 CLONES LACKING ONE OR 
MORE OF THE TOBACCO CYTOCHROME P450 SPECIFIC DOMAINS 

Four clones had high nucleic acid homology, ranging from 90% 
to 99%, to other tobacco cytochrome p450 genes reported in 
Table III. The four clones were D136-AD5, D138-AD12, 
D243-AB3 and D250-AC11. However, due to a nucleotide 
frameshift, these genes did not contain one or more of the 
three C-terminus cytochrome p450 domains and were excluded 
from the identity groups presented in Table III or Table IV. 

The amino acid sequence of one clone, D95-AG1, did not 
contain the third domain, GXRXC, used to group p450 tobacco 
genes in Table III or Table IV. This clone had low nucleic 
acid homology to other tobacco cytochrome p450 genes and 
represents a novel and different group of cytochrome p450 
genes in Nicotiana. 
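
Screening a translated clone for the GXRXC domain mentioned above can be sketched as a simple pattern search, where X stands for any amino acid. The function names and the example peptides below are hypothetical and serve only to illustrate the check; they are not taken from the clones described here.

```python
import re

# G-X-R-X-C in one-letter amino acid codes; "." matches any residue.
GXRXC = re.compile(r"G.R.C")


def find_gxrxc(protein):
    """Return the 0-based start of the first GXRXC match, or None."""
    match = GXRXC.search(protein.upper())
    return match.start() if match else None


def has_gxrxc(protein):
    """True when the peptide contains at least one GXRXC motif."""
    return find_gxrxc(protein) is not None


# Hypothetical peptides: the first carries the motif (GRRIC), the second does not.
print(has_gxrxc("PFGAGRRICPG"))
print(has_gxrxc("PFGAGKKICPG"))
```

A clone whose reading frame is shifted upstream of the motif would typically fail this check even when its nucleotide sequence is highly similar to that of an intact gene.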

EXAMPLE 13: USE OF NICOTIANA CYTOCHROME P450 FRAGMENTS AND 
CLONES IN ALTERED REGULATION OF TOBACCO PROPERTIES 

Tobacco p450 nucleic acid fragments or whole genes are 
useful in identifying and selecting plants that have altered 
tobacco phenotypes or tobacco constituents and, more 
importantly, altered metabolites. Transgenic tobacco plants 
are generated by a variety of transformation systems that 
incorporate nucleic acid fragments or full-length genes, 
selected from those reported herein, in an orientation for 
either down-regulation (for example, antisense orientation) 
or over-expression (for example, sense orientation). For 
over-expression of full-length genes, any nucleic acid 
sequence that encodes the entire amino acid sequence, or a 
functional part of it, of the full-length genes described in 
this invention is desired, provided it is effective in 
increasing the expression of a certain enzyme and thus 
results in a phenotypic effect within Nicotiana. Homozygous 
Nicotiana lines are obtained through a series of 
backcrosses and assessed for phenotypic changes including, 
but not limited to, analysis of endogenous p450 RNA 
transcripts, p450 expressed peptides and concentrations of 
plant metabolites, using techniques commonly available to 
one having ordinary skill in the art. The changes exhibited 
in the tobacco plants provide information on the functional 
role of the selected gene of interest or are of utility as 
a preferred Nicotiana plant species. 
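
The difference between sense and antisense orientation can be made concrete with a short sketch: the antisense insert is the reverse complement of the sense fragment. The helper name is an assumption introduced for illustration, and the nine-base input is simply the start of a coding sequence used as toy data.

```python
# Translation table mapping each base to its complement (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    cleaned = "".join(dna.split())  # tolerate spaces and line breaks
    unexpected = set(cleaned.upper()) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return cleaned.translate(_COMPLEMENT)[::-1]


sense = "ATGAATTAT"
antisense = reverse_complement(sense)
print(antisense)
```

Applying the function twice returns the original fragment, which is a convenient sanity check when preparing constructs in both orientations.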

Numerous modifications and variations in practice of the 
invention are expected to occur to those skilled in the art 
upon consideration of the foregoing detailed description of 
the invention. Consequently, such modifications and 
variations are intended to be included within the scope of 
the following claims. 

WHAT IS CLAIMED IS: 

1. An isolated nucleic acid molecule from Nicotiana, 
wherein said nucleic acid molecule comprises a nucleic 
acid sequence selected from the group consisting of 
SEQ. ID. No.:149, 151, 153, 155, 157, 159, 161, 163, 
165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 
187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 
209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 
231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 
253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 
275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295 
and 297. 

2. An isolated nucleic acid molecule from Nicotiana, 
wherein said nucleic acid molecule comprises a nucleic 
acid sequence selected from the group consisting of SEQ. 
ID. No. 299 through SEQ. ID. No. 357. 

3. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence selected from 
196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 
218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 
240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 
262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 
284, 286, 288, 290, 292, 294, 296 and 298. 

4. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 180 or 
SEQ. ID. No. 182, SEQ. ID. No. 184 or SEQ. ID. No. 224. 

5. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 218 or 
SEQ. ID. No. 246. 

6. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 168. 

7. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 202, 204 
or SEQ. ID. No. 276. 

8. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 20, SEQ. 
ID. No. 260, or SEQ. ID. No. 268. 

9. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 256 and 
SEQ. ID. No. 254. 

10. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 266 or 
SEQ. ID. No. 240. 

11. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 172, SEQ. 
ID. No. 190 or SEQ. ID. No. 220. 

12. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 216 or 
SEQ. ID. No. 262. 

13. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 50, SEQ. 
ID. No. 152, SEQ. ID. No. 196 or SEQ. ID No. 198. 

14. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 296, SEQ. 
ID. No. 160, SEQ. ID. No. 158, SEQ. ID. No. 204, SEQ. 
ID. No. 206 and SEQ. ID. No. 208. 

15. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 162 or 
SEQ. ID. No. 164. 

16. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 212, 214 
238 or 254. 

17. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 188 or 
170. 

18. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 214, 241, 
258 or 252. 

19. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 186, SEQ. 
ID. No. 248, or SEQ. ID. No. 228. 

20. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 298 or 
SEQ. ID. No. 176. 

21. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 234. 

22. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 236. 

23. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 230. 

24. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 174. 

25. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 174. 

26. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 226. 

27. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 178. 

28. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 272. 

29. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 180 or 
SEQ. ID. No. 182, SEQ. ID. No. 184 or SEQ. ID. No. 224. 

30. An isolated protein from Nicotiana, wherein said 
protein has at least 90% homology to an amino acid 
sequence comprising SEQ. ID. No. 218 or SEQ. ID no. 
246. 

31. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 168. 

32. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 202, 204 
or SEQ. ID. No. 276. 

33. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 20, SEQ. 
ID. No. 260, or SEQ. ID. No. 268. 

34. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 256 and 
SEQ. ID. No. 254. 

35. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 85% amino acid identity to SEQ. ID. No. 266 or 
SEQ. ID. No. 240. 

36. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 172, SEQ. 
ID. No. 190 or SEQ. ID. No. 220. 

37. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 216 or 
SEQ. ID. No. 262. 

38. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 50, SEQ. 
ID. No. 152, SEQ. ID. No. 196 or SEQ. ID No. 198. 

39. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 296, SEQ. 
ID. No. 160, SEQ. ID. No. 158, SEQ. ID. No. 204, SEQ. 
ID. No. 206 and SEQ. ID. No. 208. 

40. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 162 or 
SEQ. ID. No. 164. 

41. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 212, 214 
238 or 254. 

42. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 188 or 
170. 

43. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 214, 241, 
258 or 252. 

44. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 186, SEQ. 
ID. No. 248, or SEQ. ID. No. 228. 

45. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 298 or 
SEQ. ID. No. 176. 

46. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 234. 

47. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 236. 

48. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 230. 

49. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 174. 

50. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 174. 

51. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 226. 

52. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 178. 

53. An isolated protein from Nicotiana, wherein said 
protein comprises an amino acid sequence comprising at 
least 90% amino acid identity to SEQ. ID. No. 272. 

54. A transgenic plant, wherein said transgenic plant 
comprises the nucleic acid molecule of Claim 1 or 2. 

55. The transgenic plant of Claim 54, wherein said plant is 
a tobacco plant. 

56. A method of producing a transgenic plant, wherein said 
method comprises the steps of: 

(i) operably linking said nucleic acid molecule of any one 
of Claims 1 or 2 with a promoter functional in said 
plant to create a plant transformational vector; 

(ii) transforming said plant with said plant 
transformational vector of step (i); 

(iii) selecting a plant cell transformed with said 
transformation vector; and 

(iv) regenerating a transformed plant from said 
transformed plant cell. 

57. The method of Claim 56, wherein said nucleic acid 
molecule is in an antisense orientation. 

58. The method of Claim 56, wherein said nucleic acid 
molecule is in a sense orientation. 

59. The method of Claim 56, wherein said nucleic acid 
molecule is in a RNA interference orientation. 

60. The method of Claim 56, wherein said nucleic acid 
molecule is expressed as a double stranded RNA 
molecule. 

61. The method of Claim 56, wherein said double stranded 
RNA molecule is about 15 to 25 nucleotides in length. 

62. The method of Claim 56, wherein said transgenic plant 
is a tobacco plant. 

63. A method of selecting a plant containing a nucleic acid 
molecule, wherein said plant is analyzed for the 
presence of a nucleic acid sequence selected from the 
group consisting of 149, 151, 153, 155, 157, 159, 161, 
163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 
185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 
207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 
229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 
251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 
273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 
295 and 297. 



64. The method of selecting a plant of Claim 63, wherein 
said plant is analyzed by DNA hybridization. 

65. The method of selecting a plant of Claim 64, wherein 
said DNA hybridization is Southern blot analysis. 

66. The method of selecting a plant of Claim 65, wherein 
said DNA hybridization is Northern blot analysis. 

67. The method of selecting a plant of Claim 66, wherein 
said plant is analyzed by PCR detection. 

68. The method of Claim 67, wherein said plant is a tobacco 
plant. 

69. The method of Claim 85, wherein said DNA hybridization 
comprises a nucleic acid probe, said nucleic acid probe 
is a nucleic acid fragment comprising a nucleic acid 
sequence selected from the group consisting of 149, 
151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 
173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 
195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 
217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 
239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 
261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 
283, 285, 287, 289, 291, 293, 295 and 297. 

70. The method of selecting a plant of Claim 69, wherein 
said plant is a transgenic plant. 

71. The method of selecting a plant of Claim 69, wherein 
said plant is selected from a mutagenesis population. 

72. The method of selecting a plant of Claim 69, wherein 
said plant is selected from a breeding population. 

73. The method of selecting a plant of Claim 69, wherein 
said plant is selected from a Nicotiana. 

FIG. 1 



SEQ ID 1 D58-BG7 

1 GCACAACTT GCTAT CAACT TGGTCACATC TATGTTGGGT 

61 CATTTGTTGC ATCATTTTAC ATGGGCTCCG GCCCCGGGGG TTAACCCGGA GGATATTGAC 

121 TTGGAGGAGA GCCCTGGAAC AGTAACTTAC ATGAAAAATC CAATACAAGC TATTCCAACT 

181 CCAAGATTGC CTGCACACTT GTAT GGACGT GTGCCAGTGG ATATGTAA 
SEQ ID 2 

AQLAINLVTSMLGHLLHHFTWAPAPGVNPEDIDLEESPGTVTYMKNPIQAIPTPRLPAHLYGRVPVDM 



FIG. 2 

SEQ ID 3 D58-AB1 

1 GCACAACT TGCTATCAAC TTGGTCACAT CTATGTT GGG 

61 TCATTTGTTG CATCATTTTA CGTGGGCTCC GCCCCGGGGG GTTAACCCGG AGAATATTGA 

121 CTT GGAGGAG AGCCCTGGAA CAGTAACTTA CATGAAAAAT CCAATACAAG CT AT T C CT AC 

181 TCCAAGATTG CCTGCACACT TGTATGGACG TGTGCCAGTG GATATGTAA 
SEQ ID 4 

AQLAINLVTSMLGHLLHHFTWAPPPGVNPENIDLEESPGTVTYMKNPIQAIPTPRLPAHLYGRVPVDM 



FIG. 3 

SEQ ID 5 D18 6-AH4 

1 ATGAATTAT T CATT GCAAG TGGAACACCT TTCAATTGCT 
61 CATAT GAT CC AAGGTTT CAG TTTTGCAACT ACGACCAATG AGCCTTTGGA TAT GAAAC AA 
121 GGTGTGGGTT TAACTTTACC AAAGAAGACT GATGTTGAAG TGCTAATTAC ACCTCGCCTT 
181 CCTCCTACGC TTTATCAATA TTAA 
SEQ ID 6 

MNYSLQVEHLSIAHMIQGFSFATTTNEPLDMKQGVGLTLPKKTDVEVLITPRLPPTLYQY 
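
The relationship between each nucleotide listing and the amino acid sequence that follows it can be checked with a small translation sketch using the standard genetic code. The function name and the choice to stop at the first stop codon are assumptions made for illustration; the input below is the first line of SEQ ID 5 with its spacing removed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna, stop_at_stop=True):
    """Translate DNA in frame 1; unknown codons become 'X'."""
    dna = "".join(dna.split()).upper()  # drop listing spaces and line breaks
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*" and stop_at_stop:
            break
        protein.append(aa)
    return "".join(protein)


first_line = "ATGAATTATTCATTGCAAGTGGAACACCTTTCAATTGCT"
print(translate(first_line))  # MNYSLQVEHLSIA
```

The printed peptide matches the beginning of SEQ ID 6, which suggests the listing is read in the first frame starting at the ATG codon.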



FIG. 4 

SEQ ID 7 D58-BE4 

1 GCACAACTT GCTAT CAACT TGGTCACATC TATGTTGGGT 
61 CATTTGTTCA TCATTTTACA TGGGCTCCGG CCCCGGGGGT TAACC CGGAG GATATTGACT 
121 TGGAGGAGAG CCCTGGAACA GTAACTTACA TGA 
SEQ ID 8 

AQLAINLVTSMLGHLFIILHGLRPRGLTRRILTWRRALEQ 



FIG. 5 

SEQ ID 9 D56-AH7 

1 GAAGGATTG GCTGTTCGAA TGGTTGCCTT GTCATTGGGA 
61 TGTATTATTC AATGTTTTGA TTGGCAACGA ATCGGCGAAG AATTGGTTGA TATGACTGAA 
121 GGAACTGGAC TTACTTTGCC TAAAGCTCAA CCTTTGGTGG CCAAGTGTAG CCCACGACCT 
181 AAAATGGCTA ATCTTCTCTC TCAGATTTGA 

SEQ ID 10 

EGIAVRMVALSLGCIIQCFDWQRIGEELVDMTEGTGLTLPKAQPLVAKCSPRPKMANLLSQI 

FIG. 6 

SEQ ID 11 D13a-5 

1 GAAGGATTG GCTATTCGAA TGGTTGCATT GTCATTGGGA 
61 TGTATTATTC AATGCTTTGA TTGGCAACGA CTT GGGGAAG GATTGGTTGA TAAGACTGAA 
121 GGAACT GGAC TTACTTTGCC TAAAGCTCAA CCTTTAGTGG CCAAGT GTAG CCCACGACCT 
181 ATAAT GGCTA ATCTTCTTTC TCAGATTTGA 
SEQ ID 12 

EGLAIRMVALSLGCIIQCFDWQRLGEGLVDKTEGTGLTLPKAQPLVAKCSPRPIMANLLSQI 



FIG. 7 



SEQ ID 13 D56-AG10 

1 ATAGGTTTT GCGACTTTAG TGACACATCT GACTTTTGGT 
61 CGCTTGCTTC AAGGTTTTGA TTTTAGTAAG C CAT C AAAC A CGCCAATTGA CAT G ACAG AA 
121 GGCGTAGGCG TTACTTTGCC TAAGGTTAAT CAAGTTGAAG TTCTAATTAC CCCTCGTTTA 
181 CCTT CTAAGC TTTATTTATT TTGA 
SEQ ID 14 

IGFATLVTHLTFGRLLQGFDFSKPSNTPIDMTEGVGVTLPKVNQVEVLITPRLPSKLYLF 



FIG. 8 

SEQ ID 15 D35-33 

1 ATAGGCTTT GCGACTTTAG TGACACATCT GACTTTTGGT 
61 CGCTTGCTTC AAGGTTTTGA TTTTAGTAAG CCATCAAACA CGCCAATTGA CAT G ACAG AA 
121 GGCGTAGGCG TTACTTTGCC TAAGGTTAAT CAAGTT GAAG TTCTAATTAC CCCTCGTTTA 
181 CCTTCTAAGC TTTATTTAT 
SEQ ID 16 

IGFATLVTHLTFGRLLQGFDFSKPSNTPIDMTEGVGVTLPKVNQVEVLITPRLPSKLYL 



FIG. 9 

SEQ ID 17 D34-62 

1 ATAAATTTT GCGACTTTAG TGACACATCT GACTTTTGGT 
61 CGCTTGCTTC AAGGTTTTGA TTTTAGTACG CCATCAAACA CGCCAATAGA CAT GAC AGAA 
121 GGCGTAGGCG TTACTTTGCC TAAGGTAAAT CAAGT GGAAG TTCTAATTAC CCCTCGTTTA 
181 CCTTCTAAGC TTTATGTATT CTGA 
SEQ ID 18 

INFATLVTHLTFGRLLQGFDFSTPSNTPIDMTEGVGVTLPKVNQVEVLISPRLPSKLYVF 



FIG. 10 

SEQ ID 19 D56AA7 

1 ATTATACTT GCATTGCCAA TTCTTGGCAT CACTTT GGGA 
61 CGTTTGGTTC AGAACTTTGA GCTGTTGCCT CCTCCAGGCC AGT CGAAGCT CGACACCACA 
121 GAGAAAGGTG GAC AGT T C AG TCTCCACATT TTGAAGCATT CCACCATTGT GTT GAAACCA 
181 AGGTCTTTCT GA 
SEQ ID 20 

IILALPILGITLGRLVQNFELLPPPGQSKLDTTEKGGQFSLHILKHSTIVLKPRSF 

FIG. 11 

SEQ ID 21 D56-AE1 

1 ATTATACTT GCATTGCCAA TTCTTGGCAT TACTTTGGGA 
61 CGTTTGGTTC AGAACTTTGA GCTGTTGCCT CCTCCAGGCC AGTCGAAGCT CGACACCACA 
121 GAGAAAGGTG GACAGTTCAG TCTCCATATT TTGAAGCATT CCACCATTGT GTTGAAACCA 
181 AGGTCTTGCT GA 
SEQ ID 22 

IILALPILGITLGRLVQNFELLPPPGQSKLDTTEKGGQFSLHILKHSTIVLKPRSC 



FIG. 12 

SEQ ID 23 D35-BB7 

1 TATT GCACTT GGGGTT GCAT CAATGGAACT TGCATTGTCA 
61 AATCTTCTTT AT GCATTT GA TTGGGAGTTA CCTTTTGGAA TGAAAAAAGA AGACATTGAC 
121 ACAAACGC CA GGCCTGGAAT TAG CAT GCAT AAGAAAAACG AACTTTATCT TAT C C CTAAA 
181 AATTATCTAT AG 
SEQ ID 24 

IALGVASMEIALSNLLYAFDWELPFGMKKEDIDTNARPGITMHKKNELYLIPKNYLPSKLYLF 



FIG. 13 

SEQ ID 25 D177-BA7 

1 ATT GCACTT G GGGTTGCATC CATGGAACTT 
121 GCTTTGTCAA ATCTTCTTTA T GCATTT GAT TGGGAGTTAC CTTACGGAGT GAAAAAAGAA 
181 AACATT GACA CAAATGTCAG GCCTGGAATT AC CAT GCAT A AGAAAAACGA ACTTTGCCTT 
241 AT C CCTAGAA ATTAT CTATA G 
SEQ ID 26 

IALGVASMEIALSNLLYAFDWELPYGVKKENIDTNVRPGITMHKKNELCLIPRNYL 



FIG. 14 

SEQ ID 27 D56A-AB6 

1 GGTATT GCAC TTGGGGTTGC ATCCATGGAA CTTGCTTTGT CAAATCTTCT TTAT GCATTT 
61 GATT GGGAGT TGCCTTATGG AGTGAAAAAA GAAGACATCG ACACAAACGT TAGGCCTGGA 
121 ATTGCCATGC ACAAGAAAAA CGAACTTTGC CTTGTCCCAA AAAATTATTT ATAA 

SEQ ID 28 . 

IALGVASMELALSNLLYAFDWELPYGVKKEDIDTNVRPGIAMHKKNELCLVPKNYL 



FIG. 15 

SEQ ID 29 D144-AE2 

1 ATT GCACTT GGGG TTGCATCCAT GGAACTTGCT 
61 TTGTCAAATC TTCTTTATGC ATT T GATT GG GAGTTGCCTT AT GGAGT GAA AAAAGAAGAC 
121 AT C G ACACAA ACGTTAGGCC TGGAATTGCC AT GCACAAGA AAAACGAACT TTGCCTTGTC 
181 CCAAAAAAAT TATTTATAAA TTATATTGGG AC GT G GAT CT CATGCTAG 
SEQ ID 30 

IALGVASMEIALSNLLYAFDWELPYGVKKEDIDTNVRPGIAMHKKNELCLVPKKLFINYIGTWISC 

FIG. 16 

SEQ ID 31 D56-AG11 

i ~=is SS =as sss 

181 CCTTATCAAC CTCCTCAAAA CTGA 

iSgSnaylpiaqllyhfdwelptgikfsdldltelvgvtaarksdlylvatpyqppqn 



FIG. 17 



ID "i SSSSS:^ tcgaaccaag cgacttggag 

121 TtSSgT SgtISgAGT AACTGCCGCT AGAAAAAGTG ACCTTTACTT GGTTGCGACT 

181 ccttatcaac ctcctcaaaa gtga 
iSg^aylpiaqllyhfdwklpagiepsdldltelvgvtaarksdlylvatpyqppqk 



FIG. 18 



SEQ ID 35 D56-AC7 

, ATrC TATTT GGTTTAGCTA ATGTTGGACA ACCTTTAGCT MmmB .^»n 
fil CAGTTACTTT ATCACTTCGA TTGGAAACTC CCTAATGGAC AAAGTCATGA GAATTTCGAC 



3 61 GGA 
SEQ ID 36 



MLFGX^GQPLAQLLYHFDWKLPNGQSHENFDMTESPGISATRKDDLVLIATPYDSY 



FIG. 19 



SEQ ID 37 D144-AD1 

1 ATGC TATTTGGTTT AGCTAAT GTT 



1 Ti^TCAGTrACTTTATCAC ITCGATTGGA AACTCCCTAA TGGACAAACT 

iS ScSaStt tSgac ggaatttctg ctacaagaaa GGATGATCTT 

181 ATTTTGATTG CCACTCCTGC TCATTCTTGA 

SSgSn^gqpiaqllyhfdwklpngqthqnfdmtespgisatrkddliliatpahs 



FIG. 20 



SEQ ID 39 D144-AB5 

1 TTAT TATTCGGTTT AGTTAATGTA mmrriiTTrriEV AGACTCTTCC T GGGATAAGT 

aS SS SSSSS SS SSSSJ GGATGATCTT 

181 tgtttaattg ctactccttt tggtctcaat taa 
SfgJwvghplaqli.yhfdwkt L pgiss DS fdmtetdgvtagrkddlcliatpfgl N 

FIG. 21 

SEQ ID 41 D181-AB5 

1 A TGTCGTTTGG TTTAGTTAAC ACT GGGCAT C CTTTAGCTCA 

61 GTTGCTCTAT TTCTTTGACT GGAAATTCCC TCATAAGGTT AATGCAGCTG ATTTTCACAC 

121 TACT GAAACA AGTAGAGTTT TTGCAGCAAG CAAAGAT G AC CTCTACTTGA TTCCAACAAA 

181 T CACAT GGAG CAAGAGTAG 
SEQ ID 42 

MSFGLVNTGHPLAQLLYFFDWKFPHKVNAADFHTTETSRVFAASKDDLYLIPTNHMEQE 



FIG. 22 

SEQ ID 43 D73-AC9 

1 AT GTCGTTTGGT TTAGTTAACA CAGGGCATCC TTTAGCCCAG 

121 TTGCTCTATT GCTTTGACTG GAAACT CCCT GACAAGGTTA ATGCAAATGA TTTTCGCACT 

181 ACTGAAACAA GTAGAGTTTT TGCAGCAAGC AAAGATGACC T CTACTT GAT TCCCACAAAT 

241 CACAGGGAGC AAGAATAG 
SEQ ID 44 

MSFGLVNTGHPLAQLLYCFDWKLPDKVNANDFRTTETSRVFAASKDDLYLIPTNHREQE 



FIG. 23 

SEQ ID 45 D56-AC12 

1 ATGCAATTT GGTTTGGCTC TTGTTACTCT GCCATTGGCT 

61 CATTTGCTTC ACAATTTTGA TTGGAAACTT CCCGAAGGAA TTAATGCAAG GGATTTGGAC 

121 AT GACAG AG G CAAATGGGAT ATCTGCTAGA AGAGAAAAAG ATCTTTACTT GATT GCTACT 

181 CCTTAT GTAT CACCTCTTGA TTAA 
SEQ ID 46 

MQFGLALWLPLAHLLHNFDWKLPEGINARDLDMTEANGISARREKDLYLIATPYVSPLD 



FIG. 24 

SEQ ID 47 D58-AB9 

1 ATGACTTAT GCATTGCAAG TGGAACACCT AACAATGGCA 
61 CATTTGATCC AGGGTTTCAA TTACAGAACT CCAACT GAT G AGCCCTTGGA TAT GAAAGAA 
121 GGTGCAGGCA TAACTATACG TAAGGTAAAT CCTGTGAAAG TGATAATTAC GCCTCGCTTG 
181 GCACCTGAGC TTTATTAA 
SEQ ID 48 

OTYALQVEHLTMAHLIQGFNYRTPTDEPLDMKEGAGITIRKWPVKVIITPRLAPELY 



FIG. 25 

SEQ ID 49 D56-AG9 

1 ATGACTTAT GCATTGCAAG TGGAACACCT AACAATGGCA 
61 CATTTAATCC AGGGTTTCAA TTACAAAACT CCAAAT GACG AGGCCTT GGA TAT GAAGGAA 
121 GGTGCAGGCA TAACTATACG TAAGGTAAAT CCTGTGGAAC TGATAATAGC GCCTCGCCTG 
181 GCACCTGAGC TTTATTAA 
SEQ ID 50 

MTYALQVEHLTMAHLIQGFNYKTPNDEALDMKEGAGITIRKVNPVELIIAPRLAPELY 

FIG. 26 

SEQ ID 51 D56-AG6 

1 ATGACTTAT GCATTGCAAG TGGAACACCT AACAAT GGCA m ^ m ^ rro 
61 CATTTAATCC AGGGTTT CAA TTACAAAACT C CAAAT G AC G AGGCCTTGGA TAT GAAGGAA 
121 GGTGCAGGCA T AACAAT AC G TAAGGTAAAT CCAGTGGAAT TGATAATAAC GCCTCGCTTG 
181 GCACCTGAGC TTTACTAA 
SEQ ID 52 

MTYALQVEHLTMAHLIQGFNYCT^ 
FIG. 27 



SEQ ID 53 D35-BG11 

1 ATGACTTAT GCATTGCAAG TGGAACACTT AACAAT GGCA 
61 CATTTGATCC AAGGTTTCAA TTACAGAACT C CAAAT G AC G AGCCCTT GGA TAT GAAGGAA 
121 GGTGCAGGCA TAACTATACG TAAGGTAAAT CCTGTGGAAC T GAT AATAGC GCCTCGCCTG 
181 GCACCTGAGC TTTATTAA 
SEQ ID 54 

MTYALQVEHLTMAHLIQGFNYRTPNDEPLDMKEGAGITIRKVNPVELIIAPRLAPELY 



FIG. 28 

SEQ ID 55 D35-42 

1 ATGACTTAT GCATTGCAAG TGGAACACTT AACAAT GGCA m * m „„^** 
61 CATTTGATCC AAGGTTTCAA TTACAGAACT CCAAATGACG AGCCCTTGGA TAT GAAGGAA 
121 GGTGCAGGCA TAACTATACG TAAGGTAAAT CCTGTGGAAC TGATAATAGC GCCCCTGGCA 
181 CCTGAGCTTT ATTAA 
SEQ ID 56 

MTYALQVEHLTMAHLIQGFNYRTPNDEPLDMKEGAGITIRKVNPVELIIAPLAPELY 



FIG. 29 

SEQ ID 57 D35-BA3 

1 ATGACTTAT GCATTGCAAG TGGAACACTT AACAAT GGCA 
61 CATTTGATCC AAGGTTTCAA TTACAGAACT CCAAATGACG AGCCCTTGGA TAT GAAGGAA 
121 GGTGCAGGCA TAACTATACG TAAGGTAAAT CCTGCGGAAC TGATAATAGC GCCTCGCCTG 
181 GCACCTGAGC TTTATTAA 
SEQ ID 58 

MTYALQVEHLTMAHLIQGFNYRTPNDEPLDMKEGAGITIRKVNPAELIIAPRLAPELY 



FIG. 30 

SEQ ID 59 D34-57 

1 ATGACTTAT GCATTACAAG TGGAACACCT AACAATAGCA 
61 CATTTGATCC AGGGTTT CAA TTACAAAACT CCAAATGACG AGCCCTTGGA TAT GAAGGAA 
121 GGT GCAGGAT T AAC CAT ACG TAAAGTAAAT CCTGTAGAAG TGACAACTAC GCCTCGCCTG 
181 GCACCTGAGC TTTATTAA 



SEQ ID 60 

MTYALQVEHLTIAHLIQGFNYKTPNDEPLDMKEGAGLTIRKVNPVEVTTTARLAPELY 

FIG. 31 

SEQ ID 61 D34-52 

1 ATGACTTAT GCATTACAAG TGGAACACCT AACAATAGCA m , mM ^™ 
61 CATTTGATCC AGGGTTTCAA TTACAAAACT CCAAAT GACG AGCCCTTGGA TAT GAAGGAA 
121 GGT GCAGGAT TAACTATACG TAAAGTAAAT CCT GTAGAAG TGACAATTAC GGCTCGCCTG 
181 GCACCTGAGC TTTATTAA 
SEQ ID 62 

MTYALQVEHLTIAHLIQGFNYKTPNDEPLDMKEGAGLTIRKVNPVEVTITARLAPELY 



FIG. 32 

SEQ ID 63 D34-25 

1 ATGACTTAT GCATTACAAG TGGAACACCT AACAATAGCA 
61 CATTTGATCC AGGGTTTCAA TTACAAAACT CCAAAT GACG AGCCCCTGGA TAT GAAGGAA 
121 GGT GCAGGAT TAACTATACG TAAAGTAAAT CCT GTAGAAG TGACAATTAC GGCTCGCCTG 
181 GCACCTGAGC TTTATTAA 
SEQ ID 64 

MTYALQVEHLTIAHLIQGFNYKTPNDEPLDMKEGAGLTIRKVNPVEVTITARLAPELY 



FIG. 33 



SEQ ID 65 D56AD10 

1 TATAGCCTT GGACTTAAGG TTAT CCGAGT AACATTAGCC 
61 AACATGTTGC AT GGATT CAA CTGGAAATTA CCT GAAGGTA TGAAGCCAGA AGATATAAGT 
121 GTGGAAGAAC ATTATGGGCT CACTACACAT CCTAAGTTT C CTGTTCCTGT GATCTTGGAA 
181 TCTAGACTTT CTTCAGATCT CTATTCCCCC ATCACTTAA 



FIG. 34 



SEQ ID 67 D56-AA11 

1 ATACAGTCTT GGGATTCGTA TAATTAGGGC AACTTTAGCT 
61 AACTTGTTGC AT GGATT CAA CTGGAGATTG CCTAATGGTA TGAGTCCAGA AGACATTAGC 
121 AT GGAAGAGA TTTATGGGCT AATTACACAC CCCAAAGTCG CACTTGACGT GAT GAT GGAG 
181 CCTCGACTTC CCAACCATCT TTACAAATAG 

SEQ ID 68 

YSLGIRIIRATLANLLHGFNWRLPNGMSPEDISMEEIYGLITHPKVALDVMMEPRLPNHLYK 



FIG. 35 

SEQ ID 69 D177-BD5 

1 ATTAATTTTT CAATACCACT TGTTGAGCTT 
121 GCACTTGCTA AT CTATT GTT T CATTATAAT TGGTCACTTC CTGAAGGGAT GCTAGCTAAG 
181 GATGTTGATA TGGAAGAAGC TTTGGGGATT ACCATGCACA AGAAATCTCC CCTTTGCTTA 
241 GTAGCTTCTC ATTATACTTG TTGA 
SEQ ID 70 

INFSIPLVEIAIANLLFHYNWSLPEGMIAKDVDMEEALGITMHKKSPLCLVASHYTC 

FIG. 36 

SEQ ID 71 D56A-AG10 

1 ATGCAACTTG GGCTTTATGC ATT GGAAAT G GCTGTGGCCC ATCTTCTTCA TTGTTTTACT 
61 TGGGAATTGC CAGAT GGTAT GAAACCAAGT GAGCTTAAAA TGGATGATAT TTTTGGACTC 
121 ACTGCTCCAA AAGCTAATCG ACTCGTGGCT GTGCCTACTC CACGTTTGTT GTGTCCCCTT 
181 TATTAATTGA 
SEQ ID 72 

MQLGLYALEMAVAHLLHCFTWELPDGMKPSELKMDDIFGLTAPKANRLVAVPTPRLLCPLY 



FIG. 37 

SEQ ID 73 58-BC5 

1 ATGCAACTT GGGCTTTATG CATTAGAAAT GGCAGTGGCC 
61 CATCTTCTTC TTTGCTTTAC TTGGGAATTG CCAGAT GGTA TGAAACCAAG TGAGCTTAAA 
121 AT GGAT GAT A TTTTTGGACT CACTGCTCCA AGAGCTAAT C GACTCGTGGC TGTGCCTAGT 
181 CCACGTTTGT TGTGCCCACT TTATTAA 
SEQ ID 74 

MQLGLYALEMAVAHLLLCFTWELPDGMKPSELKMDDIFGLTAPRANRLVAVPSPRLLCPLY 



FIG. 38 

SEQ ID 75 D58-AD12 

1 ATGCAACTT GGGCTTTATG CAT T GGAAAT GGCTGTGGCC 
61 CATCTTCTTC ATTGTTTTAC TTGGGAATTG CCAGAT GGTA T GAAACCAAG TGAGCTTAAA 
121 AT GGAT GATA TTTTTGGACT CACTGCTCCA AGAGCTAAT C GACTCGTGGC TGTGCCTACT 
181 CCACGTTTGT TGTGTCCCCT TTATTAA 
SEQ ID 76 

MQLGLYALEMAVAHLLHCFTWELPDGMKPSELKMDDIFGLTAPRANRLVAVPTPRLLCPLY 



FIG. 39 

SEQ ID 77 D56-AC11 

1 ATGCTTTGG AGTGCGAGTA TAGTGCGCGT CAGCTACCTA 
61 ACTTGTATTT ATAGATTCCA AGTATATGCT GGGTCTGTGT TCAGAGTAGC ATGA 
SEQ ID 78 

MLWSASIVRVSYLTCIYRFQVYAGSVFRVA 



FIG. 40 

SEQ ID 79 D35-39 

1 ATGCTTTGG AGTGCGAGTA TAGTGCGCGT CAGCTACCTA 
61 ACTTGTATTT ATAGATTCCA AGTATATGCT GGGTCTGTGT TCAGAGTAGC ATGA 
SEQ ID 80 

MLWSASIVRVSYLTCIYRFQVYAGSVFRVA 

FIG. 41 

SEQ ID 81 D58-BH4 

1 ATGCTTTGG AGTGCGAGTA TAGTGCGCGT CAGCTACCTA 
61 ACCTGTATTT ATAGATTCCA AGTATATGCT GGGTCTGTGT TCAGAGTAGC ATGA 
SEQ ID 82 

MLWSASIVRVSYLTCIYRFQVYAGSVFRVA 

FIG. 42 

SEQ ID 83 D177-BD7 

1 ATTAATTTTT CAATACCACT TGTTGAGCTT GCACTT GCTA AT CTATT GTT T CAT T AT AAT 
61 TGGTCACTTC CTGAGGGGAT GCTACCTAAG GAT GTT GAT A TGGAAGAAGC TTTGGGGATT 
121 ACCAT GCACA AGAAATCTCC CCTTTGCTTA GTAGCTTCTC ATTATAACTT GTTGTGA 
SEQ ID 84 

INFSIPLVEIAIANLLFHYNWSLPEGMLPKDVDMEEALGITMHKKSPLCLVASHYNLL 



FIG. 43 
SEQ ID 85 

D176-BF2 

1 AT AT CATTT GGT TTGGCTAATG TTTATTTGCC ACT AGCT CAA 
121 TTGTTATATC ATTTTGATTG GAAACTCCCT ACT GGAAT CA ATTCAAGTGA CTTGGACATG 
181 ACT GAGT CGT CAGGAGTAAC TTGTGCTAGA AAGAGT GATT TAT ACT T GAC TGCTACTCCA 
241 TAT CAACTTT CTCAAGAGTG A 
SEQ ID 86 

GISFGIANVYLPIAQLLYHFDWKLPTGINSSDLD^ESSGVTCARKSDLYLTATPYQLSQE 



FIG. 44 

SEQ ID 87 D56-AD6 

1 ATGCTTTGG AGTGCGAGTA TAGTGCGCGT CAGCTACCTA 
61 ACTT GTATTT ATAGATTCCA AGTATATGCT GGGTCTGTGT CCAGAGTAGC ATGA 
SEQ ID 88 

MLWSASIVRVSYLTCIYRFQVYAGSVSRVA 



FIG. 45 

SEQ ID 89 D73A-AD6 

1 CT GAATTTTGCA AT GTTAGAGG CAAAAATGGC ACTTGCATTG 
121 ATTCTACAAC ACTATGCTTT TGAGCTCTCT CCATCTTATG CACATGCTCC TCATACAATT 
181 ATCACTCTGC AACCTCAACA TGGTGCTCCT TTGATTTTGC GCAAGCTGTA G 
SEQ ID 90 

LNFAMLEAKMALALILQHYAFELSPSYAHAPHTIITLQPQHGAPLILRKL 
SEQ ID 91 D70A-BA11 

1 CT GAATTTTGCA ATGTTAGAGG CAAAAATGGC ACTTGCATTG 
121 ATTCTACAAC ACTATGCTTT TGAGCTCTCT CCATCTTATG CACACGCTCC TCATACAATT 
181 ATCACTCTGC AACCTCAACA TGGTGCTCCT TTGATTTTGC GCAAGCTGTA G 
SEQ ID 92 

LNFAMLEAKMALALILQHYAFELSPSYAHAPHTIITLQPQHGAPLILRKL 



FIG. 47 

SEQ ID 93 D70A-BB5 

1 AA TAATTTTGCA ATGTTGGAAA CTAAGATTGC CTTAGCAATG 
121 ATCCTACAGC GTTTTGCTTT CGAGCTTTCT CCATCTTACG CTCATGCACC TACTTATGTC 
181 GTCACTCTTC GACCTCAGTG TGGTGCTCAC TTAATCTTGC AAAAATTATA GGTCCTTAAT 
241 CTGGATTTCC CATTATTGAG TAGTGCCTAA TAAATCTTCT CTATCACTAT TTTTCCATCT 
301 TTCA 
SEQ ID 94 

NNFAMLETKIALAMILQRFAFELSPSYAHAPTYVVTLRPQCGAHLILQKL 



FIG. 48 

SEQ ID 95 D70A-AB5 

1 AGCGAAGGGG TGGCAAAGGC AACAAAGGGG AAAATGACAT ATTTTCCATT TGGTGCAGGA 
61 CCGCGAAAAT GCATTGGGCA AAACTTCGCG ATTTTGGAAG CAAAAATGGC TATAGCTATG 
121 ATTCTACAAC GCTTCTCCTT CGAGCTCTCC CCATCTTATA CACACTCTCC ATACACTGTG 
181 GTCACTTTGA AACCCAAATA TGGTGCTCCC CTAATAATGC ACAGGCTGTA GTCCTGTGAG 
241 AATATGCTAT CCGAGGAATT CAGTTCCT 
SEQ ID 96 

QNFAILEAKMAIAMILQRFSFELSPSYTHSPYTVVTLKPKYGAPLIMHRL 



FIG. 49 

SEQ ID 97 D70A-AA8 

1 AGCGAAGGGG TGGCAAAGGC AACAAAGGGG AAAATGACAT ATTTTCCATT TGGTGCAGGA 
61 CCGCGAAAAT GCATTGGGCA AAACTTCGCG ATTTTGGAAG CAAAAATGGC TATAGCTATG 
121 ATTCTACAAC GCTTCTCCTT CGAGCTCTCT CCATCTTATA CACACTCTCC ATACACTGTG 
181 GTCACTTTGA AACCCAAATA TGGTGCTCCC CTAATAATGC ACAGGCTGTA GTCCTGT 
SEQ ID 98 

QNFAILEAKMAIAMILQRFSFELSPSYTHSPYTVVTLKPKYGAPLIMHRL 



FIG. 50 



SEQ ID 99 D70A-AB8 

1 C AAAATTTTGC CAT GTTAGAA GCAAAGATGG CTCTGTCTAT GATCCTGCAA 
121 CGCTTCTCTT TTGAACTGTC TCCGTCTTAT GCACATGCCC CTCAGTCCAT ATTAACCGT 
181 CAGCCACAAT ATGGTGCTCC ACTTATTTTC CACAAGCTAT AA 
SEQ ID 100 

QNFAMLEAKMALSMILQRFSFELSPSYAHAPQSILTVQPQYGAPLIFHKL 
SEQ ID 101 D70A-BH2 

1 AT AAACTTTGCA ATGACAGAAG CGAAGATGGC TATGGCTATG 
121 ATTCTGCAAC GCTTCTCCTT TGAGCTATCT CCATCTTACA CACATGCTCC ACAGTCTGTA 
181 ATAACTATGC ^CCCAATA TGGTGCTCCT CTTATATTGC ACAAATTGTA A 
SEQ ID 102 

INFAMTEAKMAMAMILQRFSFELSPSYTHAPQSVITMQPQYGAPLILHKL 
SEQ ID 103 D70A-AA4 

1 AT AAACTTTGCA AT GGCAGAAG CGAAGATGGC TATGGCTATG 
121 ATTCTGCAAC GCTTCTCCTT TGAGCTATCT CCATCTTACA CACATGCTCC ACAGTCTGTA 
181 ATAACTATGC AACCCCAATA TGGTGCTCCT CTTATATTGC ACAAATTGTA £ CAGTCTGTA 
SEQ ID 104 

INFAMAEAKMAMAMILQRFSFELSPSYTHAPQSVITMQPQYGAPLILHKL 



FIG. 53 



SEQ ID 105 D70A-BA1 

1 CA AAACTTTGCA AT GAT GGAAG CAAAAATGGC AGTAGCTATG 
J« ^ ACTACAAA AATTTTCCTT TGAACTATCC CCTTCTTATA CACATGCTCC ATTTGCAATT 
181 GTGACTATTC ATCCTCAGTA TGGTGCTCCT CTGCTTATGC GCAGACTTTA ATTTGCAATT 



SEQ ID 106 

QNFAMMEAKMAVAMILQKFSFELSPSYTHAPFAIVTIHPQYGAPLLMRRL 



FIG. 54 



SEQ ID 107 D70A-BA9 

1 CA AAACTTTGCA AT GAT GGAAG CAAAAATGGC AGTAGCTATG 
121 ATACTACATA AATTTTCCTT TGAACTATCC CCTTCTTATA CACATGCTCC ATTTGCAATT 
181 GTGACTATTC ATCCTCAGTA TGGTGCTCCT CTGCTTATGC GCAGACTTTA A GCAATT 
SEQ ID 108 

QNFAMMEAKMAVAMILHKFSFELSPSYTHAPFAIVTIHPQYGAPLLMRRL 
SEQ ID 109 D70A-BD4 

1 CA AAATTTTGCT ATGTTAGAGG CTAAAATGGC AAT GGCTATG 
121 ATTCTGAAAA CCTATGCATT TGAACTCTCT CCATCTTATG CTCATGCTCC TCATCCACTA 
181 CTACTTCAAC CTCAATATGG TGCTGAATTA ATTTTGTACA AGTTGTAG TCATCCACTA 
SEQ ID 110 

QNFAMLEAKMAMAMILKTYAFELSPSYAHAPHPLLLQPQYGAQLILYKL 
SEQ ID 111 D181-AC5 

1 TATAGCATGG GGCTCAAGGC GATTCAAGCT AGCTTAGCTA 
61 ATCTTCTACA TGGATTTAAC TGGTCATTGC CTGATAATAT GACTCCTGAG GACCTCAACA 
121 TGGATGAGAT TTTTGGGCTC TCTACACCTA AAAAATTTCC ACTTGCTACT GTGATTGAGC 
181 CAAGACTTTC ACCAAAACTT TACTCTGTTT GA 
SEQ ID 112 

YSMGLKAIQASLANLLHGFNWSLPDNMTPEDLNMDEIFGLSTPKKFPLATVIEPRLSPKLYSV 
SEQ ID 113 D144-AH1 

1 TAT AGCTT GGGGC TCAAGGAGAT TCAAGCTAGC 
61 TTAGCTAATC TTCTACATGG ATTTAACTGG TCATTGCCTG ATAATATGAC TCCTGAGGAC 
121 CTCAACATGG ATGAGATTTT TGGGCTCTCT ACACCTAAAA AATTTCCACT TGCTACTGTG 
181 ATTGAGCCAA GACTTTCACC AAAACTTTAC TCTGTTTGA 
SEQ ID 114 

YSLGLKEIQASLANLLHGFNWSLPDNMTPEDLNMDEIFGLSTPKKFPLATVIEPRLSPKLYSV 



FIG. 58 

SEQ ID 115 D34-65 

1 CATAGCTTG GGGCTCAAGG TGATTCAAGC TAGCTTAGCT 
61 AATCTTCTAC ATGGATTTAA CTGGTCATTG CCTGATAATA TGACTCCTGA GGACCTCAAC 
121 ATGGATGAGA TTTTTGGGCT CTCTACACCT AAAAAATTTC CACTTGCTAC TGTGATTGAG 
181 CCAAGACTTT CACCAAAACT TTACTCTGTT TGA 
SEQ ID 116 

HSLGLKVIQASLANLLHGFNWSLPDNMTPEDLNMDEIFGLSTPKKFPLATVIEPRLSPKLYSV 



FIG. 59 

SEQ ID 117 D35-BG2 

1 CTGTGCTTT CCATGTTTAA TCTCTAGTTA TATACTGGCT 
61 TTGAATGTGA ATCTGTATCA TAATTTCTTG CAAATTTCTC CTTCCATTTC TTATTAA 
SEQ ID 118 

LCFPCLISSYILALNVNLYHNFLQISPSISY 



FIG. 60 

SEQ ID 119 D73A-AH7 

1 TCTG GACTTGCTCA ATGTGTGGTT GGTTTAGCTT TAGCAACTCT AGTGCAGTGT 
121 TTTGAGTGGA AAAGGGTAAG CGAAGAGGTG GTTGATTTGA CGGAAGGAAA AGGTCTCACT 
181 ATGCCAAAAC CCGAGCCACT CATGGCTAGG TGCGAAGCTC GTGACATTTT TCACAAAGTT 
241 CTTTCAGAAA TATCTTAA 
SEQ ID 120 

SGLAQCVVGLALATLVQCFEWKRVSEEVVDLTEGKGLTMPKPEPLMARCEARDIFHKVLSEIS 
SEQ ID 121 D58-AA1 

1 TTGGGCTTG GCAACGGTGC ATGTGAATTT GATGTTGGCC 
61 CGAATGATTC AAGAATTTGA ATGGTCCGCT TACCCGGAAA ATAGGAAAGT GGATTTTACT 
121 GAGAAATTGG AATTTACTGT GGTGATGAAA AATCCTTTAA GAGCTAAGGT CAAGCCAAGA 
181 ATGCAAGTGG TGTAA 
SEQ ID 122 

LGLATVHVNLMLARMIQEFEW 



FIG. 62 

SEQ ID 123 D73A-XE10 

1 TATGCTT TGGCTATGCT TCATTTAGAG 

121 TACTTTGTGG CTAATTTGGT TTGGCATTTT CGATGGGAGG CTGTGGAGGG AGATGATGTT 
181 GATCTTTCAG AAAAGCTAGA ATTCACCGTT GTGATGAAGA ATCCACTTCG AGCTCGTATC 
241 TGCCCCAGAG TTAACTCTAT TTGA 
SEQ ID 124 

YAIAMLHLEYFVANLWHFRWEAVEGDDVDLSEKLEFTVVMKNPLRARICPRW 
SEQ ID 125 D56A-AC12 

1 GGTCAGCAAG TTGGACTTCT TAGAACAACC ATTTTCATCG CCTCATTACT GTCTGAATAT 
61 AAGCTGAAAC CTCGCTCACA CCAGAAACAA GTTGAACTCA CCGATTTAAA TCCAGCAAGT 
121 TGGCTTCATT CGATAAAAGG CGAACTGTTA GTCGATGCGA TTCCTCGAAA GAAGGCGGCA 
181 TTTTAA 
SEQ ID 126 

GQQVGLLRTTIFIASLLSEYKLKPRSHQKQVELTDLNPASWLHSIKGELLVDAIPRKKAAF 



FIG. 64 

SEQ ID 127 D177-BF7 

1 ATCACATTTG CTAAGTTTGT GAAT GAGCT A 
121 GCATTGGCAA GATTAATGTT CCATTTTGAT TTCTCGCTAC CAAAAGGAGT TAAGCATGAG 
181 GATTTGGACG TGGAGGAAGC TGCTGGAATT ACTGTTAGAA GGAAGTTCCC CCTTTTAGCC 
241 GTCGCCACTC CATGCTCGTG A 
SEQ ID 128 

ITFAKFVNELALARLMFHFDFSLPKGVKHEDLDVEEAAGITVRRKFPLLAVATPCS 



FIG. 65 

SEQ ID 129 D73A-AG3 

1 CA GAGGTAT GCT ATAAACCATT TGATGCTCTT TATTGCGTTG 
121 TTCACGGCTC TGATTGATTT CAAGAGGCAC AAAACGGACG GCTGTGATGA TATCGCGTAT 
181 ATTCCAACCA TTGCTCCAAA GGATGATTGT AAAGTGTTCC TTTCACAGAG GTGCACTCGA 
241 TTCCCATCTT TTTCATGA 
SEQ ID 130 

QRYAINHLMLFIALFTALIDFKRHKTDGCDDIAYIPTIAPKDDCKVFLSQRCTRFPSFS 
SEQ ID 131 D70A-AA12 

1 ATG TCATTTGGTT TAGCTAATCT TTACTTACCA TTGGCTCAAT 
121 TACTCTATCA CTTTGACTGG AAACTCCCAA CCGGAATCAA GCCAAGAGAC TTGGACTTGA 
181 CCGAATTATC GGGAATAACT ATTGCTAGAA AGGGTGACCT TTACTTAAAT GCTACTCCTT 
241 ATCAACCTTC TCGAGAGTAA 
SEQ ID 132 

MSFGLANLYLPLAQLLYHFDWKLPTGIKPRDLDLTELSGITIARKGDLYLNATPYQPSRE 
SEQ ID 133 D185-BC1 

1 TTGGGCTTG GCAACGGTGC ATGTGAATTT GATGTTGGCC 
61 CGAACGATTC AAGAATTTGA ATGGTCCGCT TACCCGGAAA ATAGGAAAGT GGATTTTACT 
121 GAGAAATTGG AATTTACTGT GGTGATGAAA AACCCTTTAA GAGCTAAGGT CAAGCCAAGA 
181 ATGCAAGTGG TGTAA 
SEQ ID 134 

LGLATVHVNLMLARTIQEFEW 
SEQ ID 135 D185-BG2 

1 TTGGGCTTG GCAACGGTGC ATGTGAATTT GATGTTGGCC 
61 CGAATGATTC AAGAATTTGA ATGGTCCGCT TACCCGGAAA ATAGGAAAGT GGATTTACTG 
121 AGAAATTGGA ATTTACTGTG GTGA 
SEQ ID 136 

LGLATVHVNLMLARMIQEFEWSAYPENRKVDLLRNWNLLW 
SEQ ID 137 D185-BE1 

1 ATCACATTT GCTAAGTTTG TGAATGAGCT AGCATTGGCA 
61 AGATTAATGT TCCATTTTGA TTTCTCGCTA CCAAAAGGAG TTAAGCATGA GGATTTGGAC 
121 GTGGAGGAAG CTGCTGGAAT TACTGTTAGG AGGAAGTTCC CCCTTTTAGC CGTCGCCACT 
181 CCATGCTCGT GA 
SEQ ID 138 

ITFAKFVNELALARLMFHFDFSLPKGVKHEDLDVEEAAGITVRRKFPLLAVATPCS 



FIG. 70 

SEQ ID 139 D185-BD2 

1 ATCACATTT GCTAAGTTTG TGAATGAGCT AGCATTGGCA 
61 AGATTAATGT TCCATTTTGA TTTCTCGCTA CCAAAAGGAG TTAAGCATGC GGATTTGGAC 
121 GTGGAGGAAG CTGCTGGAAT TACTGTTAGA AGGAAGTTCC CCCTTTTAGC CGTCGCCACT 
181 CCATGCTCGT GA 
SEQ ID 140 

ITFAKFVNELALARLMFHFDFSLPKGVKHADLDVEEAAGITVRRKFPLLAVATPCS 
SEQ ID 141 D176-BG2 

1 CA AAATTTTGCC ATGTTAGAAG CAAAGACTAC TTTGGCTATG 
121 ATCCTACAAC GCTTCTCCTT TGAACTGTCT CCATCTTATG CACATGCTCC TCAGTCCATA 
Si SSS ScCCCAGTA TGGTGCTCCA CTTATTTTGC ATAAAATATA G 
SEQ ID 142 

QNFAMLEAKTTLAMILQRFSFELSPSYAHAPQSIITLQPQYGAPLILHKI 



FIG. 72 



SEQ ID 143 D185-BD3 

1 ATTATCCTT GCACTGCCAA TTCTTGGCAT TACCTTGGGA 
61 CGCTTGGTGC AGAACTTTGA GTTGTTGCCT CCTCCAGGAC AGTCAAAGCT TGACACAACA 

iS gagaaaggcg SgSattcag tctgcacatt ttgaagcatt ccaccattgt gatgaaacca 

181 AGATCTTTTT AA 
IILALPILGITLGRLVQNFELLPPPGQSKLDTTEKGGQFSLHILKHSTIVMKPRSF 
SEQ ID 145 D176-BC3 

i r laaaTTTTGC CATGTTAGAA GCAAAGACTA CTTTGGCTAT 
12^ GA^lcAA CGCOTCTCCT TTGAACTGTC TCCATCTTAT GCACATGCTC CTCAGTCCAT 
ill SSSSS ScCCCAGTA TGGTGCTCCA CTTATTTTGC ATAAAATATA GTTTATTACT 
111 TGTAAGTAGT OTCTCGTTTT ATGTTAAGCA T GAGT CCAAA ATGTTAAGGC T T GTAGAACT 
301 GCAAAATGGG AATGCATTTG CACTCGTGCA CTGTAGATTG TTGTAA 

qSaSea^tiami^ 
falvhcrll 
SEQ ID 147 D176-BB3 

J; ^~L TGC GAGCAGTTTC TTTGGCATTA GGTGCACTTA TTCAATGCTT TGACTGGCAA 
iS JSSSSE SgSagcS GGAGGAAAGC TATAATTCTA GAATGACTAT gcagaacaag 
Si cctotgaagg TTGTCTGCAC tccacgcgaa gatcttggcc agcttctatc ccaactctaa 

SSg^vsLlgaliqcfdwqieeaesleesynsrmtmqnkplkvvctpredlgqllsql 
NAME D89-AB1 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. ^°^ T J^ TTCCT ^ GTCCTAAC TAAAAATGGA GATTCAGTTT TCTAACTTAG TTGCATTCTT 
61 GCTCTTTCTC TCCAGCATCT TTCTTCTATT CAAAAAATGG AAAACCAGAA AACTAAATTT 
121 GCCTCCTGGT CCATGGAAAT TACCTTTTAT TGGAAGTTTA CACCATTTGG CTGTGGCAGG 
181 TCCACTTCCT CACCATGGCC TAAAAAATTT AGCCAAACGC TATGGTCCTC TTATGCATTT 
241 ACAACTTGGA CAAATTCCTA CACTCATCAT ATCATCACCT CAAATGGCAA AAGAAGTACT 
301 AAAAACTCAC GACCTCGCTT TTGCCACTAG ACCAAAGCTT GTCGCGGCCG ACATCATTCA 
361 CTACGACAGC ACGGACATAG CATTTTCTCC GTACGGTGAA TACTGGAGAC AAATTCGTAA 
421 AATTTGCATA TTGGAACTCT TGAGTGCCAA GATGGTCAAA TTTTTTAGCT CGATTCGCCA 
481 AGATGAGCTC TCGAAGATGC TCTCATCTAT ACGAACGACA CCCAATCTTA CAGTCAATCT 
541 TACTGACAAA ATTTTTTGGT TTACGAGTTC GGTAACTTGT AGATCAGCTT TAGGGAAGAT 
601 ATGTGGTGAC CAAGACAAAT TGATCATTTT TATGAGGGAA ATAATATCAT TGGCAGGTGG 
661 ATTTAGTATT GCTGATTTTT TCCCTACATG GAAAATGATT CATGATATTG ATGGTTCGAA 
721 ATCTAAACTG GTGAAAGCAC ATCGTAAGAT TGATGAAATT TTGGGAAATG TTGTTGATGA 
781 GCACAAAAAG AACAGAGCAG ATGGCAAGAA GGGTAATGGT GAATTTGGTG GTGAAGATTT 
841 GATTGATGTA TTGTTAAGAG TTAGAGAAAG TGGAGAAGTT CAAATTCCTA TCACAAATGA 
901 CAATATCAAA TCAATATTAA TCGACATGTT CTCTGCAGGA TCTGAAACAT CATCGACGAC 
961 TATAATTTGG GCATTAGCTG AAATGATGAA GAAACCAAGT GTTTTAGCAA AGGCACAAGC 
1021 TGAAGTAAGG CAAGCTTTGA AGGAGAAAAA AGGTTTTCAA CAGATTGATC TTGATGAGCT 
1081 AAAATATCTC AAGTTAGTAA TCAAAGAAAC CTTAAGAATG CACCCTCCAA TTCCTCTATT 
1141 AGTTCCTAGA GAATGTATGG AGGATACAAA GATTGATGGT TACAATATAC CTTTCAAAAC 
1201 AAGAGTCATA GTTAATGCAT GGGCAATCGG ACGAGATCCA GAAAGTTGGG ATGACCCCGA 
1261 AAGCTTTATG CCAGAGAGAT TTGAGAATAG TTCTATTGAC TTTCTTGGAA ATCATCATCA 
1321 GTTTATACCA TTTGGTGCAG GAAGAAGGAT TTGTCCGGGA ATGCTATTTG GTTTAGCTAA 
1381 TGTTGGACAA CCTTTAGCTC AGTTACTTTA TCACTTCGAT TGGAAACTCC CTAATGGACA 
1441 AAGTCATGAG AATTTCGACA TGACTGAGTC ACCTGGAATT TCTGCTACAA GAAAGGATGA 
1501 TCTTGTTTTG ATTGCCACTC CTTATGATTC TTATTAAGCA GTAGCAGAAA TAAAAAGCCG 
1561 GGGCAAACAG AAAAAA 

seq. id. J°^jqJ snlva FLLFLS SIFL LFKKWKTRKL NLPPGPWKLP FIGSLHHIAV AGPLPHHGLK 
61 NLAKRYGPLM HLQLGQIPTL IISSPQMAKE VLKTHDLAFA TRPKLVAADI IHYDSTDIAF 
121 SPYGEYWRQI RKICILELLS AKMVKFFSSI RQDELSKMLS SIRTTPNLTV NLTDKIFWFT 
181 SSVTCRSALG KICGDQDKLI IFMREIISLA GGFSIADFFP TWKMIHDIDG SKSKLVKAHR 
241 KIDEILGNVV DEHKKNRADG KKGNGEFGGE DLIDVLLRVR ESGEVQIPIT NDNIKSILID 
301 MFSAGSETSS TTIIWALAEM MKKPSVLAKA QAEVRQALKE KKGFQQIDLD ELKYLKLVIK 
361 ETLRMHPPIP LLVPRECMED TKIDGYNIPF KTRVIVNAWA IGRDPESWDD PESFMPERFE 
421 NSSIDFLGNH HQFIPFGAGR RICPGMLFGL ANVGQPLAQL LYHFDWKLPN GQSHENFDMT 
481 ESPGISATRK DDLVLIATPY DSY 
NAME D89-AD2 

ORGANISM NICOTIANA TABACUM 



ID. NO TTCCTAAGTC CTAACTAAAA ATGGAGATTC AGTTTTCTAA CTTAGTTGCA 

« TTrTTGCTCT TTCTCTCCAG CATCTTTCTT CTATTCAAAA AATGGAAAAC CAGAAAACTA 
iS SSSScctc Sgctccatg GAAATTACCT TTTATTGGAA GTTTACACCA TTTGGCTGTG 
i fli ^ScSc ttcctcacca Ssgcctaaaa AATTTAGCCA AACGCTATGG tcctcttatg 

IK cSSacSc SgScAAAT icCTACACTC ATCATATCAT CACCTCAAAT GGCAAAAGAA 

lol Sactaaaaa ctcacgacct cgcttttgcc actagaccaa AGCTTGTCGT GGCCGACATC 
3 3°ei aXgcacgga catagcattt JCTCCGTACG ? tgaa ? actg «™»»t 

421 CGTAAAATTT GCATATTGGA ACTCTTGAGT GCCAAGATGG TCAAATTTTT ^AGCTCGATT 
4 81 CGCCAAGATG AGCTCTCGAA GATGCTCTCA TCTATACGAA CGACACCCAA T CTT AC AGT C 
J5l AATCTTACTG ACAAAATTTT TTGGTTTACG AGTTCGGTAA CTTGTAGATC AGCTTTAGGG 

IS SSa^caaS caaattgatc atttttatga gggaaataat atcattggca 

IV rrTGGATTTA CTATTGCTGA TTTTTTCCCT ACATGGAAAA TGATTCATGA TATTGATGGT 

?2i SSSSSS aaSggSgaa agcacatcgt aagattgatg aaattttggg aaatgttgtt 
7ri gSgagSca aaaagaacag agcagatggc aagaagggta atggtgaatt tggtggtgaa 
IV SSSSSSs atgtattgtt aagagttaga gaaagtggag aagttcaaat tcctatcaca 
SJSSSSa tcaaatcSS attaatcgac atgttctctg cgggatctga aacatcatcg 
IV; a^SSS ?S5g"gca^ agctgaaatg atgaagaaac caagtgtttt agcaaaggca 
1021 caSSSag* ?aaggcSgc ? T tgaaggag aaaaaaggtt ttcaacagat tgatcttgat 
loll SSSaaaat atctcaagtt agtaatcaaa gaaaccttaa gaatgcaccc tccaattcct 
SaSagttc ctagagaatg tatggaggat acaaagattg atggttacaa tatacctttc 
ilol aaaacaagag tcatagttaa tgcatgggca atcggacgag atccagaaag ttgggatgac 
llll c^3£S? tStgccaga gagatttgag aatagttcta ttgactttct tggaaatcat 

llll CATCAGTTTA TACCATTTGG TGCAGGAAGA AGGATTTGTC CGGGAATGCT ATTTGGTTTA 

llll StS?gttg gacaaccttt agctcagtta ctttatcact tcgattggaa actccctaat 

\aV rr ACAAAGTC ATGAGAATTT CGACAT GACT GAGTCACCTG GAATTTCTGC TACAAGAAAG 

llol SSg ??SS£tgc cactccttat gattcttatt aagcagtagc agaaataaaa 
1561 AGCCGGGGCA aacagaaaaa a 

SKQ. ID. NO "2 FLLFLSSIFL L FKKWKT RKL NLPPGPWKLP FIGSLHHLAV AGPLPHHGLK 

61 NLAKRYGPIM HLQLGQIPTL IISSPQMAKE VLKTHDLAFA TRPKLWADI IHYDSTDIAF 
121 SP^EWROI RKICILELLS AKMVKFFSSI RQDELSKMLS SIRTTPNLTV NLTDKIFWFT 
III ?swcr!Elg KICGDQDKLI IFMREIISIA GGFSIADFFP TWKMIHDIDG SKSKLVKAHR 
III ShSradg kkgngefgge dlidvllrvr esgevqipit NDNIKSILID 

301 SsAGSfSs TTIIWALAEM MKKPSVIAKA QAEVRQALKE KKGFQQI DLD ELKYLKLVIK 

III SlrSSfip lIvprecmed tkidgynipf ktrvivnawa igrdpeswdd pesfmperfe 
111 nSSflSh hqfipfgagr ricpgmlfgl anvgqplaql lyhfdwklpn gqshenfdmt 

481 ESPGISATRK DDLVLIATPY DSY 
NAME D90A-BB3 
ORGANISM NICOTIANA TABACUM 

SEQ. ID. J°^^ GCAGT TTGAAGAT AC CAACTAACCA AAATGCAGTT CTTCAGCTTG GTTTCCATTT 
61 TCCTATTTCT ATCTTTTCTC TTTTTGTTAA GGAAATGGAA GAACTCGAAT AGCCAAAGGA 
121 AAAAATTGCC ACCAGGTCCA TGGAAACTAC CAATACTAGG AAGTATGCTT CATATGGTTG 
181 GTGGACTACC ACACCATGTC CTTAGAGATT TAGCCAAAAA ATATGGACCG CTTATGCACC 
241 TTCAATTAGG TGAAGTTTCT GCAGTTGTGG TTACTTCTCC TGATATGGCA AAAGAAGTAC 
301 TAAAAACTCA TGACATCGCT TTCGCGTCTA GGCCTAGCCT TTTGGCCCCG GAGATTGTCT 
361 GTTACAATAG GTCTGATCTT GCGTTTTGCC CCTATGGCGA TTATTGGAGA CAAATGCGTA 
421 AAATATGTGT CTTGGAAGTG CTCAGTGCCA AGAATGTTCG GACATATAGC TCTATTAGGC 
481 GCGATGAAGT TCTTCGTCTC CTTAATTTTA TCCGGTCATC TTCTGGTGAG CCTGTTAATA 
541 TTACGGAAAG GATCTTTTTG TTCACAAGCT CCATGACATG TAGATCAGCG TTTGGGCAAG 
601 TATTCAAGGA GCAAGACAAA TTTATACAAC TAATTAAAGA AGTTATACTC TTAGCAGGAG 
661 GGTTTGATGT GGCTGACATA TTCCCTTCAT ACAAGTCTCT TCATGTGCTC AGTGGAATGA 
721 AGGGTAAGAT TATGAATGCA CACCATAAGG TAGATGCTAT TGTTGAGAAT GTCATCAACG 
781 AGCACAAGAA AAATCTTGCA ATTGGGAAAA CTAATGGAGC GTTAGGAGGT GAAGATTTAA 
841 TTGATGTTCT TCTAAAACTT ATGAATGATG GAGGCCTTCA ATTTCCTATC ACCAACGACA 
901 ACATCAAAGC TATAATCTTT GACATGTTTG CTGCTGGAAC AGAGACTTCA TCGTCAACAA 
961 TTGTGTGGGC TATGGTGGAA ATGGTGAAAA ATCCAACTGT ATTTGCGAAA GCTCAAGCAG 
1021 AAGTAAGAGA TGCATTTAGA GAAAAAGAAA CTTTTGATGA AAATGATGTG GAGGAGCTAA 
1081 ACTATCTAAA GTTAGTCATT AAAGAAACTC TAAGACTTCA TCCACCGGTT CCACTTTTGC 
1141 TCCCAAGAGA ATGTAGGGAA GAGACAAATA TAAACGGCTA CACTATTCCT GTAAAGACCA 
1201 AAGTCATGGT TAATGTTTGG GCATTGGGAA GAGATCCAAA ATATTGGGAT GATGCAGAAA 
1261 CTTTTAAGCC AGAGAGATTT GAGCAGTGCT CTAAGGATTT TGTTGGTAAT AATTTTGAAT 
1321 ATCTTCCATT TGGTGGTGGA AGGAGGATTT GTCCAGGGAT TTCGTTTGGT TTAGCTAATG 
1381 CTTATTTGCC ATTGGCTCAA TTACTTTATC ACTTTGATTG GGAACTCCCC ACTGGAATCA 
1441 AACCAAGCGA CTTGGACTTG ACTGAGTTGG TTGGAGTAAC TGCCGCTAGA AAAAGTGACC 
1501 TTTACTTGGT TGCGACTCCT TATCAACCTC CTCAAAAC 



SEQ. ID. NO. 154 

1 MQFFSLVSIF LFLSFLFLLR KWKNSNSQRK KLPPGPWKLP ILGSMLHMVG GLPHHVLRDL 
61 AKKYGPLMHL QLGEVSAVVV TSPDMAKEVL KTHDIAFASR PSLLAPEIVC YNRSDLAFCP 
121 YGDYWRQMRK ICVLEVLSAK NVRTYSSIRR DEVLRLLNFI RSSSGEPVNI TERIFLFTSS 
181 MTCRSAFGQV FKEQDKFIQL IKEVILLAGG FDVADIFPSY KSLHVLSGMK GKIMNAHHKV 
241 DAIVENVINE HKKNLAIGKT NGALGGEDLI DVLLKLMNDG GLQFPITNDN IKAIIFDMFA 
301 AGTETSSSTI VWAMVEMVKN PTVFAKAQAE VRDAFREKET FDENDVEELN YLKLVIKETL 
361 RLHPPVPLLL PRECREETNI NGYTIPVKTK VMVNVWALGR DPKYWDDAET FKPERFEQCS 
421 KDFVGNNFEY LPFGGGRRIC PGISFGLANA YLPLAQLLYH FDWELPTGIK PSDLDLTELV 
481 GVTAARKSDL YLVATPYQPP QN 
NAME D95-AG1 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO^J^J atgtc TTCATTTTCC acaTCTTCTG CCACTTCTAA TTCCAAACTT CCAGTTCGAG 
61 AAATCCCAGG AGACTATGGT TTCCCCTTTT TTGGAGCCAT AAAAGATAGA TATGACTACT 
121 TCTACAACCT CGGCACAGAC GAATTCTTTC TTACCAAAAT GCAAAAATAC AACTCTACTG 
181 TCTTTAGAAC CAACATGCCA CCAGGTCCAT TCATTGCTAA AAATCCCAAA GTAATTGTTC 
241 TCCTCGATGC CAAAACATTT CCCGTTCTTT TCGACAACTC TAAAGTCGAA AAAATGAACG 
301 TTCTTGATGG CACGTACGTG CCATCTACTG ATTTCTATGG CGGATATCGC CCGTGTGCTT 
361 ATCTTGATCC TTCTGAGTCA ACTCATGCCA CACTTAAAGG GTTCTTTTTA TCTTTAATCT 
421 CCCAGCTTCA TAATCAATTT ATTCCTTTAT TTAGAACCTC AATTTCTGGT CTTTTCGCAA 
481 ATCTTGAGAA TGAGATTTCC CAAAATGGCA AAGCGAACTT CAACAATATC AGCGACATTA 
541 TGTCATTCGA TTTTGTTTTT CGTTTGTTAT GTGACAAGAC CAGTCCCCAT GACACAAATC 
601 TTGGCTCTAA TGGACCAAAA CTCTTTGATA TATGGCTGTT GCCTCAACTT GCTCCATTGT 
661 TTAGTCTAGG TCTAAAATTT GTGCCGAACT TTCTGGAAGA TTTAATGTTG CATACTTTTC 
721 CCTTGCCATT TTTTCTAGTG AGATCGAATT ACCAGAAGCT TTATGATGCT TTTAGCAAGC 

781 ATGCCGAAAG TACACTGAAT GAAGCAGAGA AGAATGGGAT CAAAAGAGAC GAAGCATGCC 
841 ACAACTTAGT TTTTCTTGCA GGTTTCAATG CTTATGGTGG GATGAAAGTT TTATTCCCTG 
901 CACTGATAAA GTGGGTCGCC AATGGAGGAA AGAGTTTACA CACTCGGCTG GCAAATGAAA 
961 TCAGGACAAT TATCAAAGAA GAATGTGGGA CCATAACTCT ATCAGCAATC AACAAGATGA 
1021 GTTTAGTAAA ATCAGTAGTG TATGAAGTAT TAAGAATTGA ACCTCCAGTT CCATTCCAAT 
1081 ATGGTAAAGC CAAAGAAGAT ATCATAATCC AAAGCCATGA TTCAACTTTC TTAGTCAAGA 
1141 AAGGTGAAAT GATCTTTGGA TATCAGCCTT TTGCTACAAA AGATCCAAAG ATTTTTGACA 
1201 AACCAGAGGA GTTTATTCCG GAGAGGTTCA TGGCCGAAGG GGAAAAATTA TTAAAGTATG 
1261 TGTATTGGTC AAATGCAAGA GAGACAGATG ATCCAACGGT GGACAACAAA CAATGCCCAG 
1321 CGAAAAATCT TGTCGTGCTT TTGTGCAGGT TGATGTTGGT GGAGGTTTTC ATGCGTTACG 
1381 ACACATTCAC AGTGGAGTCA ACAAAGCTCT TTCTTGGGTC ATCAGTAACG TTCACGACTC 
1441 TGGAAAAAGC GACATGAGTT TCAGATATCT TAATTGTAGG CTGCAAATAA TAATGTGGTC 
1501 ATTCTGCAAA TTATTGTACT TGTGCTGATG 

SEQ. ID. *°^ s ^ tssat SNSKLpVREI pgdyGFPFFG AIKDRYDYFY NLGTDEFFLT KMQKYNSTVF 
61 RTNMPPGPFI AKNPKVIVLL DAKTFPVLFD NSKVEKMNVL DGTYVPSTDF YGGYRPCAYL 
121 DPSESTHATL KGFFLSLISQ LHNQFIPLFR TSISGLFANL ENEISQNGKA NFNNISDIMS 
181 FDFVFRLLCD KTSPHDTNLG SNGPKLFDIW LLPQLAPLFS LGLKFVPNFL EDLMLHTFPL 
241 PFFLVRSNYQ KLYDAFSKHA ESTLNEAEKN GIKRDEACHN LVFLAGFNAY GGMKVLFPAL 
301 IKWVANGGKS LHTRLANEIR TIIKEECGTI TLSAINKMSL VKSVVYEVLR IEPPVPFQYG 
361 KAKEDIIIQS HDSTFLVKKG EMIFGYQPFA TKDPKIFDKP EEFIPERFMA EGEKLLKYVY 
421 WSNARETDDP TVDNKQCPAK NLVVLLCRLM LVEVFMRYDT FTVESTKLFL GSSVTFTTLE 
481 KAT 
NAME D96-AB6 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 157 

1 CCAAAAATGG AGCTTCAATC TTCTCCTTTC AATTTAATTT CTTTGTTCCT CTTCTTTTCT 
61 TTTCATTTTA TTCTAGTGAA GAAATGGAAT GCCAAAATCC CAAAGTTACC TCCAGGTCCG 
121 TGGAGGCTTC CCTTTATTGG AAGCCTCCAT CACTTGAAGG GAAAACTTCC ACACCATAAT 
181 CTTAGAGATC TAGCGCGAAA ATATGGGCCT CTCATGTACT TACAACTCGG AGAAATTCCT 
241 GTAGTTGTAA TATCTTCGCC ACGTGTAGCA AAAGCTGTAC TAAAAACTCA TGATCTCGCT 
301 TTTGCAACTA GACCACGATT CATGTCCTCA GACATTGTGT TTTACAAAAG CAGGGACATC 
361 TCTTTTGCCC CATTTGGTGA TTACTGGAGA CAGATGCGTA AAATATTGAC TCAGGAACTC 
421 CTGAGTAACA AGATGCTCAA GTCATATAGC TTAATCCGAA AGGATGAGCT CTCGAAGCTC 
481 CTCTCATCGA TTCGTTTGGA AACAGGTTCT GCAGTGAACA TAAATGAAAA GCTTCTCTGG 
541 TTTACGAGCT GCATGACCTG TAGATTAGCC TTTGGAAAAA TATGCAATGA TCGGGATGAG 
601 TTGATCATGC TAATTAGGGA GATATTAACA TTATCAGGAG GATTTGATGT GGGTGATTTG 
661 TTCCCTTCCT GGAAATTACT TCATAATATG AGCAACATGA AAGCTAGGTT GACGAATGTA 
721 CACCACAAGT ATGATTTAGT TATGGAGAAC ATCATCAATG AGCACCAAGA GAATCATGCA 
781 GCAGGGATAA AGGGTAACAA CGAGTTTGGT GGCGAAGATA TGATCGATGC TCTACTGAGG 
841 GCTAAGGAGA ATAATGAGCT TCAATTTCCT ATCGAAAATG ACAACATGAA AGCAGTAATT 
901 CTGGACTTGT TTATTGCTGG AACTGAAACT TCATATACTG CAATTATATG GGCACTATCA 
961 GAATTGATGA AGCACCCAAG TGTGATGGCC AAGGCACAAG CTGAAGTGAG AAAAGTCTTC 
1021 AAAGAAAATG AAAATTTCGA CGAAAATGAT CTTGACAAGT TGCCATACTT AAAATCAGTG 
1081 ATTAAAGAAA CACTAAGGAT GCACCCTCCA GTTCCTTTGT TAGGGCCTAG AGAATGCAGG 
1141 GACCAAACAG AGATCGATGG CTACACTGTA CCTATTAAAG CTAGAGTTAT GGTTAATGCT 
1201 TGGGCGATAG GAAGAGATCC TGAAAGTTGG GAAGATCCTG AAAGTTTCAA ACCGGAGCGA 
1261 TTTGAAAATA CTTCTGTTGA TCTTACAGGA AATCACTATC AGTTCATTCC TTTCGGTTCA 
1321 GGAAGAAGAA TGTGTCCAGG AATGTCGTTT GGTTTAGTTA ACACAGGGCA TCCTTTAGCC 
1381 CAGTTGCTCT ATTGCTTTGA CTGGAAACTC CCTGACAAGG TTAATGCAAA TGATTTTCGC 
1441 ACTACTGAAA CAAGTAGAGT TTTTGCAGCA AGCAAAGATG ACCTCTACTT GATTCCCACA 
1501 AATCACAGGG AGCAAGAATA GCTTAATTTA ATGGAGTTCT TGGAAGAATT AAAGAAGAAG 
1561 GGCTATATAG GTGAGATTTT TTGTATGGTT GCA 

SEQ. ID. NO. 158 

1 MELQSSPFNL ISLFLFFSFH FILVKKWNAK IPKLPPGPWR LPFIGSLHHL KGKLPHHNLR 
61 DLARKYGPLM YLQLGEIPVV VISSPRVAKA VLKTHDLAFA TRPRFMSSDI VFYKSRDISF 
121 APFGDYWRQM RKILTQELLS NKMLKSYSLI RKDELSKLLS SIRLETGSAV NINEKLLWFT 
181 SCMTCRLAFG KICNDRDELI MLIREILTLS GGFDVGDLFP SWKLLHNMSN MKARLTNVHH 
241 KYDLVMENII NEHQENHAAG IKGNNEFGGE DMIDALLRAK ENNELQFPIE NDNMKAVILD 
301 LFIAGTETSY TAIIWALSEL MKHPSVMAKA QAEVRKVFKE NENFDENDLD KLPYLKSVIK 
361 ETLRMHPPVP LLGPRECRDQ TEIDGYTVPI KARVMVNAWA IGRDPESWED PESFKPERFE 
421 NTSVDLTGNH YQFIPFGSGR RMCPGMSFGL VNTGHPLAQL LYCFDWKLPD KVNANDFRTT 
481 ETSRVFAASK DDLYLIPTNH REQE 
NAME D96-AC2 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. ~°CTTCTTCC»A AAATGGAGCT TCAATCTTCT CCTTTCAATT TAATTTCTTT GTTCCTCTTC 
61 TTTTCTTTTC TTTTTATTCT AGTGAAGAAA TGGAATGCCA AAATCCCAAA GTTACCTCCA 
121 GGTCCGTGGA GGCTTCCCTT TATTGGAAGC CTCCATCACT TGAAGGGAAA ACTTCCACAC 
181 CATAATCTTA GAGATCTAGC GCGAAAATAT GGACCTCTCA TGTACTTACA ACTCGGAGAA 
241 ATTCCTGTAG TTGTAATATC TTCGCCACGT GTAGCAAAAG CTGTACTAAA AACTCATGAT 
301 CTCGCTTTTG CAACTAGACC ACGATTCATG TCCTCAGACA TTGTGTTTTA CAAAAGCAGG 
361 GACATCTCTT TTGCCCCATT TGGTGATTAC TGGAGACAGA TGCGTAAAAT ATTGACTCAG 
421 GAACTCCTGA GTAACAAGAT GCTCAAGTCA TATAGCTTAA TCCGAAAGGA TGAGCTCTCG 
481 AAGCTCCTCT CATCGATTCG TTTGGAAACA GGTTCTGCAG TGAACATAAA TGAAAAGCTT 
541 CTCTGGTTTA CGAGCTGCAT GACCTGTAGA TTAGCCTTTG GAAAAATATG CAATGATCGG 
601 GATGAGTTGA TCATGCTAAT TAGGGAGATA TTAACATTAT CAGGAGGATT TGATGTGGGT 
661 GATTTGTTCC CTTCCTGGAA ATTACTTCAT AATATGAGCA ACATGAAAGC TAGGTTGACG 
721 AATGTACACC ACAAGTATGA TTTAGTTATG GAGAACATCA TCAATGAGCA CCAAGAGAAT 
781 CATGCAGCAG GGATAAAGGG TAACAACGAG TTTGGTGGCG AAGATATGAT CGATGCTCTA 
841 CTGAGGGCTA AGGAGAATAA TGAGCTTCAA TTTCCTATCG AAAATGACAA CATGAAAGCA 
901 GTAATTCTGG ACTTGTTTAT TGCTGGAACT GAAACTTCAT ATACTGCAAT TATATGGGCA 
961 CTATCAGAAT TGATGAAGCA CCCAAGTGTG ATGGCCAAGG CACAAGCTGA AGTGAGAAAA 
1021 GTCTTCAAAG AAAATGAAAA TTTCGACGAA AATGATCTTG ACAAGTTGCC ATACTTAAAA 
1081 TCAGTGATTA AAGAAACACT AAGGATGCAC CCTCCAGTTC CTTTGTTAGG GCCTAGAGAA 
1141 TGCAGGGACC AAACAGAGAT CGATGGCTAC ACTGTACCTA TTAAAGCTAG AGTTATGGTT 
1201 AATGCTTGGG CGATAGGAAG AGATCCTGAA AGTTGGGAAG ATCCTGAAAG TTTCAAACCG 
1261 GAGCGATTTG AAAATACTTC TGTTGATCTT ACAGGAAATC ACTATCAGTT CATTCCTTTC 
1321 GGTTCAGGAA GAAGAATGTG TCCAGGAATG TCGTTTGGTT TAGTTAACAC AGGGCATCCT 
1381 TTAGCCCAGT TGCTCTATTG CTTTGACTGG AAACTCCCTG ACAAGGTTAA TGCAAATGAT 
1441 TTTCGCACTA CTGAAACAAG TAGAGTTTTT GCAGCAAGCA AAGATGACCT CTACTTGATT 
1501 CCCACAAATC ACAGGGAGCA AGAATAGCTT AATTTAATGG AGTTCTTGGA AGAATTAAAG 
1561 AAGAAGGGCT ATATAGGTGA GATTTTTTGT ATGGTTGCA 

SEQ. ID. NO. ISLFLFFSFL FiLVKKWNAK IPKLPPGPWR LPFIGSLHHL KGKLPHHNLR 

61 DLARKYGPLM YLQLGEIPVV VISSPRVAKA VLKTHDLAFA TRPRFMSSDI VFYKSRDISF 
121 APFGDYWRQM RKILTQELLS NKMLKSYSLI RKDELSKLLS SIRLETGSAV NINEKLLWFT 
181 SCMTCRLAFG KICNDRDELI MLIREILTLS GGFDVGDLFP SWKLLHNMSN MKARLTNVHH 
241 KYDLVMENII NEHQENHAAG IKGNNEFGGE DMIDALLRAK ENNELQFPIE NDNMKAVILD 
301 LFIAGTETSY TAIIWALSEL MKHPSVMAKA QAEVRKVFKE NENFDENDLD KLPYLKSVIK 
361 ETLRMHPPVP LLGPRECRDQ TEIDGYTVPI KARVMVNAWA IGRDPESWED PESFKPERFE 
421 NTSVDLTGNH YQFIPFGSGR RMCPGMSFGL VNTGHPLAQL LYCFDWKLPD KVNANDFRTT 
481 ETSRVFAASK DDLYLIPTNH REQE 
NAME D98-AA1 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. ^ 0 ^ T ^J TTTCT TGTACCGAGA TGGAGTTTCA ACACTTGGTT TCGTTCTTGC TATTCATCTC 
61 CTTCATCTTT CTTCTAATTC AAAAATGGAG GAAATCGAAA AAGCTGCCAC CTGGTCCGTG 
121 GAGGCTACCT ATTATTGGAA GTGTGCATCA CTTGACAAGT GGAGTACCAC ATCGAGTTCT 
181 CAGAAATTTA TCACAAAAAT TTGGCCCGAT CATGTACTTG CAGCTCGGGG AAGTTCCCAC 
241 AGTAGTTGTA TCCTCCCCAC ACATGGCCAA ACAAATTTTA AAAACTCATG ACCTCGCTTT 
301 TGCATCTAGG CCAGAAATCA TGATGGGAAA AATTATTTGC TACGATTGTA AGGACATTGC 
361 CTTTTCCCCG TATGGTGATT ATTGGAGACA TATGCGTAAA TTGAGCACCT TGGAACTACT 
421 TAGTGCCAAG ATGGTCAAGT CCTTCAGTCC AATTCGTCAA GATGAGCTCT CAAGTCTCCT 
481 ATCATCCATT GAATCAATGG GAAATTTGCC AATCAACTTA GTAGAAAAAC TTTTATGGTT 
541 TATGAATGCC GCGACATGTA GGTCAGCATT TGGGAAAGTG TGTAAAGATC AAAAAGAGTT 
601 GATAACATTG ATTCAACGAG CAGAATCATT ATCTGGTGGA TTCGAGCTGG CTGATTTGTT 
661 CCCTTCGAAG AAGTTTCTAC ATGGTATTAG TGGGATGCGA TCTAAACTAA TGGAAGCTCG 
721 TAACAAGATA GACGCAGTCT TGGACAACAT TATCAATGTG CACAGAGAGA ATCGGGCAAA 
781 TGGAAATAGT TGTAATGGTG AGTCTGGAAC TGTAGATTTC ATCGATGTTT TTCTAAGGGT 
841 CATGGAGAGT GGCGAATTAC CATTTCCGAT AGAAAATGAC AACATCAAAG CAGTTATTCT 
901 TGACATGTTC GTAGCAGGAT CTGACACATC ATCTTCAACC GTTATTTGGG CATTAACAGA 
961 AATGATGAAG AATCCAAAAG TCATGGCTAA AGCACAAGCT GAAGTGAGAG AAGCTTTTAA 
1021 AGGAAAGAAA GCATGTGATG AGGATACTGA TCTTGAAAAG CTTCATTACC TAAATTTAGT 
1081 GATCAAAGAG ACACTCCGAT TACACCCTCC AACTCCTCTA CTTGTCCCGC GAGAATGCAG 
1141 GGAGGAAACA GAGATAGAAG GATTCACTAT ACCATTGAAA AGCAAAGTCT TGGTTAACGT 
1201 ATGGGCAATT GGAAGAGATC CCGAGAATTG GAAAAATCCT GAATGTTTTA TACCAGAGAG 
1261 ATTCGAAAAT AGTTCTATTG AGTTTACTGG AAATCATTTT CAACTTCTTC CGTTTGGCGC 
1321 TGGAAGACGA ATTTGTCCAG GAATGCAATT TGGTTTGGCT CTTGTTACTC TGCCATTGGC 
1381 TCATTTGCTT CACAATTTTG ATTGGAAACT TCCCGAAGGA ATTAATGCAA GGGATTTGGA 
1441 CATGACAGAG GCAAATGGGA TATCTGCTAG AAGAGAAAAA GATCTTTACT TGATTGCTAC 
1501 TCCTTATGTA TCACCTCTTG ATTAACTCTG AAATTTTGCT TTAATGCTGC TTGCTTGCTT 
1561 CACT 

SEQ. ID. NO^lg^^ LFISFIFLLI QKWRKSKKLP PGPWRLPIIG SVHHLTSGVP HRVLRNLSQK 
61 FGPIMYLQLG EVPTVVVSSP HMAKQILKTH DLAFASRPEI MMGKIICYDC KDIAFSPYGD 
121 YWRHMRKLST LELLSAKMVK SFSPIRQDEL SSLLSSIESM GNLPINLVEK LLWFMNAATC 
181 RSAFGKVCKD QKELITLIQR AESLSGGFEL ADLFPSKKFL HGISGMRSKL MEARNKIDAV 
241 LDNIINVHRE NRANGNSCNG ESGTVDFIDV FLRVMESGEL PFPIENDNIK AVILDMFVAG 
301 SDTSSSTVIW ALTEMMKNPK VMAKAQAEVR EAFKGKKACD EDTDLEKLHY LNLVIKETLR 
361 LHPPTPLLVP RECREETEIE GFTIPLKSKV LVNVWAIGRD PENWKNPECF IPERFENSSI 
421 EFTGNHFQLL PFGAGRRICP GMQFGLALVT LPLAHLLHNF DWKLPEGINA RDLDMTEANG 
481 ISARREKDLY LIATPYVSPL D 
NAME D98-AG1 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 163 

1 CTTTCTTGTA CCGAGATGGA GTTTCAACAC TTGGTTTCGT TCTTGCTATT CATCTCCTTC 
61 ATCTTTCTTC TAATTCAAAA ATGGAGGAAA TCGAAAAAGC TGCCACCTGG TCCGTGGAGG 
121 CTACCTATTA TTGGAAGTGT GCATCACTTG ACAAGTGGAG TACCACATCG AGTTCTCAGA 
181 AATTTATCAC AAAAATTTGG CCCGATCATG TACTTGCAGC TCGGGGAAGT TCCCACAGTA 
241 GTTGTATCCT CCCCACACAT GGCCAAACAA ATTTTAAAAA CTCATGACCT CGCTTTTGCA 
301 TCTAGGCCAG AAATCATGAT GGGAAAAATT ATTTGCTACG ATTGTAAGGA CATTGCCTTT 
361 TCCCCGTATG GTGATTATTG GAGACATATG CGTAAATTGA GCACCTTGGA ACTACTTAGT 
421 GCCAAGATGG TCAAGTCCTT CAGTCCAATT CGTCAAGATG AGCTCTCAAG TCTCCTATCA 
481 TCCATTGAAT CAATGGGAAA TTTGCCAATC AACTTAGTAG AAAAACTTTT ATGGTTTATG 
541 AATGCCGCGA CATGTAGGTC AGCATTTGGG AAAGTGTGTA AAGATCAAAA AGAGTTGATA 
601 ACATTGATTC AACGAGCAGA ATCATTATCT GGTGGATTCG AGCTGGCTGA TTTGTTCCCT 
661 TCGAAGAAGT TTCTACATGG TATTAGTGGG ATGCGATCTA AACTAATGGA AGCTCGTAAC 
721 AAGATAGACG CAGTCTTGGA CAACATTATC AATGTGCACA GAGAGAATCG GGCAAATGGA 
781 AATAGTTGTA ATGGTGAGTC TGGAACTGTA GATTTCATCG ATGTTTTTCT AAGGGTCATG 
841 GAGAGTGGCG AATTACCATT TCCGATAGAA AATGACAACA TCAAAGCAGT TATTCTTGAC 
901 ATGTTCGTAG CAGGATCTGA CACATCATCT TCAACCGTTA TTTGGGCATT AACAGAAACG 
961 ATGAAGAATC CAAAAGTCAT GGCTAAAGCA CAAGCTGAAG TGAGAGAAGC TTTTAAAGGA 
1021 AAGAAAGCAT GTGATGAGGA TACTGATCTT GAAAAGCATC ATTACCTAAA TTTAGTGATC 
1081 AAAGAGACAC TCCGATTACA CCCTCCAACT CCTCTACTTG TCCCGCGAGA ATGCAGGGAG 
1141 GAAACAGAGA TAGAAGGATT CACTATACCA TTGAAAAGCA AAGTCTTGGT TAACGTATGG 
1201 GCAATTGGAA GAGATCCCGA GAATTGGAAA AATCCTGAAT GTTTTATACC AGAGAGATTC 
1261 GAAAATAGTT CTATTGAGTT TACTGGAAAT CATTTTCAAC TTCTTCCGTT TGGCGCTGGA 
1321 AGACGAATTT GTCCAGGAAT GCAATTTGGT TTGGCTCTTG TTACTCTGCC ATTGGCTCAT 
1381 TTGCTTCACA ATTTTGATTG GAAACTTCCC GAAGGAATTA ATGCAAGGGA TTTGGACATG 
1441 ACAGAGGCAA ATGGGATATC TGCTAGAAGA GAAAAAGATC TTTACTTGAT TGCTACTCCT 
1501 TATGTATCAC CTCTTGATTA ACTCTGAAAT TTTGCTTTAA TGCTGCTTGC TTGCTTCACT 

SEQ. ID. NO. 164 

1 MEFQHLVSFL LFISFIFLLI QKWRKSKKLP PGPWRLPIIG SVHHLTSGVP HRVLRNLSQK 
61 FGPIMYLQLG EVPTVVVSSP HMAKQILKTH DLAFASRPEI MMGKIICYDC KDIAFSPYGD 
121 YWRHMRKLST LELLSAKMVK SFSPIRQDEL SSLLSSIESM GNLPINLVEK LLWFMNAATC 
181 RSAFGKVCKD QKELITLIQR AESLSGGFEL ADLFPSKKFL HGISGMRSKL MEARNKIDAV 
241 LDNIINVHRE NRANGNSCNG ESGTVDFIDV FLRVMESGEL PFPIENDNIK AVILDMFVAG 
301 SDTSSSTVIW ALTETMKNPK VMAKAQAEVR EAFKGKKACD EDTDLEKHHY LNLVIKETLR 
361 LHPPTPLLVP RECREETEIE GFTIPLKSKV LVNVWAIGRD PENWKNPECF IPERFENSSI 
421 EFTGNHFQLL PFGAGRRICP GMQFGLALVT LPLAHLLHNF DWKLPEGINA RDLDMTEANG 
481 ISARREKDLY LIATPYVSPL D 
NAME D100-BE2 
ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 165 

1 CAAAAACAAA ATTCCAATGG TTAACATGTT CACTCCAATT ATATACGCTC CTCTCCTTTT 
61 AGCTTTTTAC ATTATCACAA AACATTTCTT AC GCAAACT C AGAAATAATC CACCAGCTCC 
121 ATTTCTTACT TTCCCCTTTA TTGGCCATCT TTATCTCTTC AAAAAACCAC TTCAACGTAC 
181 CTTAGCCAAA ATCTCCGAAC GTTAT GGCTC TGTTCTTCTA CTCGAATTCG GTT CACGAAA 
241 AGTACTTTTG GTTTCTTCAC CAT CT GCAGC TGAAGAATGC TTAACAAAAA ACGATATTAT 
301 TTTCGCGAAT CGTCCTCTTT TGATGGCTGG AAAACAT CTT GGATATAATT TTACATCTTT 
361 GGCTTGGAGT TCGTACGGAG AT CAT T GGAG AAATCTGCGA AGGATTACTT CAGTT GAGAT 
421 GTTTTCGACT CATCGTCTTC AAAT GCTACA TGGGATTCGT ATT GAT GAAG TGAAATCTAT 
4 81 GGTTAAGAGG CTCAATTCCT CT GCCATAGC TGAAAAATCT GTGGATATGA AGTCTATGTT 
541 TTTTGAGCTG ATGCTCAATG TT AT GAT GAG GACAATT GCT GGAAAAAGAT ATTACGGTGA 
601 GAATGTGGAG GACATT GAGG AAGCTACGAG ATT CAAAGGT TTGGTGCAAG AGACT TT CAG 
661 GATTGGCGGG GCGAC GAAT A TTGGCGACTT TTTGCCGGCG TT GAAGTT AT TGGTGAGGAA 
721 ATT GGAGAAA AGTTTAATTG TGTTGCAAGA GAACAGAGAT GAGTTTATGC AGGAATTAAT 
781 TAAAGATTGC AGAAAAAGAA TGGAGAAAGA AGGTACT GTT ACT GAT T CAG AAATT GAAGG 
841 GAACAAGAAA TGTTTAATTG AAGTTTTGTT AACACTACAA GAAAATGAAC CGGAATACTA 
901 CAAAGATGAA ATCATCAGAA GCCTTATGCT TGTTCTATTA TCAGCTGGTA CAGATACTTC 
961 AGTTGGGACA ATGGAATGGG CTTTATCATT AAT GTTAAAC CACCCT GAAA CTCTGAAGAA 
1021 AGCACAAGCT G AAAT T GAT G AACATATAGG ACATGAACGT TTAGTGGACG AGTCGGACAT 
1081 CAACAACCTA CCTTACCTAC GTT GT AT AAT CAACGAGACA TTCCGAATGT ACCCT GCAGG 
1141 ACCACTACTA GTCCCACACG AGT CGTCAGA GGAAAC CACC GTAGGAGGCT ACCGTGTACC 
1201 CGGAGGAAC C AT GTTACTT G TGAATTTGTG GGCAATTCAC AAT GAT C C AA AGCTAT GGGA 
1261 TGAACCAAGA AAGTTTAAAC CAGAAAGATT TCAAGGACTA GATGGTGTTA GAGATGGTTA 
1321 CAAAATGATG CCTTTTGGTT CTGGACGAAG GAGTTGTCCT GGAGAAGGAT TGGCTGTTCG 
1381 AATGGTTGCC TTGTCATTGG GAT GTATTAT TCAATGTTTT GATTGGCAAC GAATCGGCGA 
1441 AGAATTGGTT GATATGACTG AAGGAACTGG ACTTACTTTG CCTAAAGCTC AACCTTTGGT 
1501 GGCCAAGTGT AGCCCACGAC CTAAAATGGC TAATCTTCTC TCTCAGATTT GA 

SEQ. ID. NO. 166 

1 MVNMFTPIIY APLLLAFYII TKHFLRKLRN NPPAPFLTFP FIGHLYLFKK PLQRTLAKIS 
61 ERYGSVLLLE FGSRKVLLVS SPSAAEECLT KNDIIFANRP LLMAGKHLGY NFTSLAWSSY 
121 GDHWRNLRRI TSVEMFSTHR LQMLHGIRID EVKSMVKRLN SSAIAEKSVD MKSMFFELML 
181 NVMMRTIAGK RYYGENVEDI EEATRFKGLV QETFRIGGAT NIGDFLPALK LLVRKLEKSL 
241 IVLQENRDEF MQELIKDCRK RMEKEGTVTD SEIEGNKKCL IEVLLTLQEN EPEYYKDEII 
301 RSLMLVLLSA GTDTSVGTME WALSLMLNHP ETLKKAQAEI DEHIGHERLV DESDINNLPY 
361 LRCIINETFR MYPAGPLLVP HESSEETTVG GYRVPGGTML LVNLWAIHND PKLWDEPRKF 
421 KPERFQGLDG VRDGYKMMPF GSGRRSCPGE GLAVRMVALS LGCIIQCFDW QRIGEELVDM 
481 TEGTGLTLPK AQPLVAKCSP RPKMANLLSQ I 
NAME D100A-AC3 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 167 

1 [...]CAAA ATTCCAATGG TTAACATGTT CACTCCAATT ATATACGCTC CTCTCCTTTT 
61 AGCTTTTTAC ATTATCACAA AACATTTCTT ACGCAAACTC AGAAATAACC CACCAGCTCC 
121 ATTTCTTACT TTCCCCTTTA TTGGCCATCT TTATCTCTTC AAAAAACCAC TTCAACGTAC 
181 CTTAGCCAAA ATCTCCGAAC GTTATGGCTC TGTTCTTCTA CTCGAATTCG GTTCACGAAA 
241 AGTACTTTTG GTTTCTTCAC CATCTGCAGC TGAAGAATGC TTAACAAAAA ACGATATTAT 
301 TTTCGCGAAT CGTCCTCTTT TGATGGCTGG AAAACATCTT GGATATAATT TTACTTCTTT 
361 GGCTTGGAGT TCGTACGGAG ATCACTGGAG AAATCTTCGT AGGATTACTT CAGTTGAGAT 
421 GTTTTCGACT CATCGTCTTC AAATGCTACA TGGAATTCGT ATTGATGAAG TGAAATCTAT 
481 GGTTAAGAGG CTCAATTCCT CTGCCATAGC TGAAAAATCT GTGGATATGA AGTCTATGTT 
541 TTTTGAGCTG ATGCTCAATG TTATGATGAG GACAATTGCT GGAAAAAGAT ATTACGGTGA 
601 GAATGTGGAG GACATTGAGG AAGCTACGAG ATTCAAAGGT TTGGTGCAAG AGACTTTCAG 
661 GATTGGCGGG GCGACGAATA TTGGCGACTT TTTGCCGGCG TTGAAGTTAT TGGTGAGGAA 
721 ATTGGAGAAA AGTTTAATTG TGTTGCAAGA GAACAGAGAT GAGTTTATGC AGGAATTAAT 
781 TAAAGATTGC AGAAAAAGAA TGGAGAAAGA AGGTACTGTT ACTGATTCAG AAATTGAAGG 
841 GAACAAGAAA TGTTTAATTG AAGTTTTGTT AACACTACAA GAAAATGAAC CGGAATACTA 
901 CAAAGATGAA ATCATCAGAA GCCTTATGCT TGTTCTATTA TCAGCTGGTA CAGATACTTC 
961 AGTTGGGACA ATGGAATGGG CTTTATCATT AATGTTAAAC CACCCTGAAA CTCTGAAGAA 
1021 AGCACAAGCT GAAATTGATG AACATATAGG ACATGAACGT TTAGTGGACG AGTCGGACAT 
1081 CAACAACCTA CCTTACCTAC GTTGTATAAT CAACGAGACA TTCCGAATGT ACCCTGCAGG 
1141 ACCACTACTA GTCCCACACG AGTCGTCAGA GGAAACCACC GTAGGAGGCT ACCGTGTACC 
1201 CGGAGGAACC ATGTTACTTG TGAATTTGTG GGCTATTCAC AATGATCCAA AGCTATGGGA 
1261 TGAACCAAGA AAGTTTAAGC CAGAAAGATT TGAAGGACTA GAAGGTGTTA GAGACGGTTA 
1321 CAAAATGATG CCTTTTGGTT CTGGACGAAG GAGTTGTCCT GGAGAAGGAT TGGCTATTCG 
1381 AATGGTTGCA TTGTCATTGG GATGTATTAT TCAATGCTTT GATTGGCAAC GACTTGGGGA 
1441 AGGATTGGTT GATAAGACTG AAGGAACTGG ACTTACTTTG CCTAAAGCTC AACCTTTAGT 
1501 GGCCAAGTGT AGCCCACGAC CTATAATGGC TAATCTTCTT TCTCAGATTT GAACATAATT 
1561 GGTTTCTACC AAACATCCCC AAACTAGAAT ATTATTATTG GTTACATATA CAATGTAATC 
1621 AATTTTGAAC CATATTATAT CTCAATGTAT TCCTTTTTAA AAAAAAAAAA AAAAA 

SEQ. ID. NO. 168 

1 MVNMFTPIIY APLLLAFYII TKHFLRKLRN NPPAPFLTFP FIGHLYLFKK PLQRTLAKIS 
61 ERYGSVLLLE FGSRKVLLVS SPSAAEECLT KNDIIFANRP LLMAGKHLGY NFTSLAWSSY 
121 GDHWRNLRRI TSVEMFSTHR LQMLHGIRID EVKSMVKRLN SSAIAEKSVD MKSMFFELML 
181 NVMMRTIAGK RYYGENVEDI EEATRFKGLV QETFRIGGAT NIGDFLPALK LLVRKLEKSL 
241 IVLQENRDEF MQELIKDCRK RMEKEGTVTD SEIEGNKKCL IEVLLTLQEN EPEYYKDEII 
301 RSLMLVLLSA GTDTSVGTME WALSLMLNHP ETLKKAQAEI DEHIGHERLV DESDINNLPY 
361 LRCIINETFR MYPAGPLLVP HESSEETTVG GYRVPGGTML LVNLWAIHND PKLWDEPRKF 
421 KPERFEGLEG VRDGYKMMPF GSGRRSCPGE GLAIRMVALS LGCIIQCFDW QRLGEGLVDK 
481 TEGTGLTLPK AQPLVAKCSP RPIMANLLSQ I 
NAME D104A-AE8 (69,1755) 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 169 

1 [...] ACTATCTCCT AAATCTCCAC TCAAAAACAA AGAAGAGAAA GATTTAAAAC 

61 TAATAATTAT GAAAGAGATG GTGCAAAACA ATATGAGCAC TTCTCTTCTT GAAACTTTAC 
121 AAGCTACGCC CATGATATTC TACTTCATCG TCCCTCTCTT CTGCTTATTC CTTCTCTCCA 
181 AATCTCGCCG TAAACGTTTG CCTCCAGGTC CAACTGGCTG GCCTCTCATT GGTAACATGA 
241 TGATGATGGA CCAGTTAACT CACCGTGGCC TTGCCAAACT AGCCCAAAAA TATGGTGGTG 
301 TTTTTCACCT TAAAATGGGT TATGTTCACA AAATTGTAGT CTCTGGTCCA GACGAAGCTC 
361 GCCAAGTATT ACAGGAACAC GACATCATAT TTTCGAACCG TCCAGCGACC GTAGCCATAA 
421 GTTACCTAAC ATATGACAGG GCAGACATGG CTTTTGCTGA CTATGGACTC TTCTGGCGGC 
481 AGATGAGAAA ACTATGTGTA ATGAAACTCT TCAGCCGCAA ACGAGCTGAG TCATGGGACT 
541 CAGTTCGAGA CGAAGCGGAT TCCATGGTTA GAATTGTAAC AACCAACACA GGCACAGCTG 
601 TTAACTTAGG TGAACTTGTT TTCAGTCTCA CTCGTAATAT TAT CTACAGA GCTGCTTTTG 
661 GAACTTGTT C T GAAGAT GGA CAAGGCGAGT TCATTAAAAT TATGCAAGAG TTTTCGAAGC 
721 TATTTGGTGC TTTCAATATA GCTGATTTTA TTCCATGGCT AGGGTGGGTT GGTAAGCAGA 
781 GTCTAAATAT TAGACTTGCT AAGGCTAGAG CGTCGCTTGA TGGGTTCATT GATTCGATTA 
841 TTGATGACCA TATTATTAGA AAGAAAGCTT ATGTTAATGG CAAAAATGAT GGAGGTGATC 
901 GAGAAACTGA TATGGTGGAT GAGCTTTTAG CTTTTTACAG TGAGGAAGCA AAAGTAACTG 
961 AGTCCGAAGA TTTGCAGAAT GCTATCAGAC TTACTAAGGA TAATATCAAA GCTAT CAT CA 
1021 TGGATGTAAT GTTTGGAGGG ACAGAAACAG TGGCTTCTGC AATAGAATGG GCCATGGCAG 
1081 AGCTTATGAG GAGTCCTGAA GATCTTAAAA AGGTACAACA AGAGCTGGCT AACGTTGTTG 
1141 GACTCAACAG AAAAGTTGAA GAATCTGACT TTGAAAAATT AACATACTTA AGATGTTGTC 
1201 TAAAAGAAAC TCTACGACTT CACCCTCCAA TCCCTCTCCT CCTCCATGAG ACCGCCGAGG 
1261 AATCCACCGT CTCCGGCTAC CATATTCCGG CAAAGTCACA TGTTATTATA AATTCATTTG 
1321 CCATTGGGCG TGACAAAAAT TCATGGGAAG ATCCTGAAAC TTATAAACCA TCTAGGTTTC 
1381 TCAAAGAAGG TGTACCAGAT TTTAAAGGAG GTAATTTTGA GTTTATACCA TTTGGGTCGG 
1441 GTCGGCGGTC TTGCCCCGGT ATGCAACTTG GGCTTTATGC ATTGGAAATG GCTGTGGCCC 
1501 ATCTTCTTCA TTGTTTTACT TGGGAATTGC CAGATGGTAT GAAACCAAGT GAGCTTAAAA 
1561 T GGAT GAT AT TTTTGGACTC ACTGCTCCAA GAGCTAATCG ACTCGTGGCT GTGCCTACTC 
1621 CACGTTTGTT GTGTCCCCTT TATTAATTGA AGAAAAAAGG TGGGGCTTTT ACTTGCATCA 
1681 AAGAGTGGTG CTTGTGATTT TTCCACCTTT T GGTTAAATA TACGAATTAT TAT GAT AT AC 
1741 GAATTCTTGG GCACA 

SEQ. ID. NO. 170 

1 MKEMVQNNMS TSLLETLQAT PMIFYFIVPL FCLFLLSKSR RKRLPPGPTG WPLIGNMMMM 
61 DQLTHRGIAK LAQKYGGVFH LKMGYVHKIV VSGPDEARQV LQEHDIIFSN RPATVAISYL 
121 TYDRADMAFA DYGLFWRQMR KLCVMKLFSR KRAESWDSVR DEADSMVRIV TTNTGTAVNL 
181 GELVFSLTRN IIYRAAFGTC SEDGQGEFIK IMQEFSKLFG AFNIADFIPW LGWVGKQSLN 
241 IRLAKARASL DGFIDSIIDD HIIRKKAYVN GKNDGGDRET DMVDELLAFY SEEAKVTESE 
301 DLQNAIRLTK DNIKAIIMDV MFGGTETVAS AIEWAMAELM RSPEDLKKVQ QELANVVGLN 
361 RKVEESDFEK LTYLRCCLKE TLRLHPPIPL LLHETAEEST VSGYHIPAKS HVIINSFAIG 
421 RDKNSWEDPE TYKPSRFLKE GVPDFKGGNF EFIPFGSGRR SCPGMQLGLY ALEMAVAHLL 
481 HCFTWELPDG MKPSELKMDD IFGLTAPRAN RLVAVPTPRL LCPLY 
NAME D105-AD6 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 171 

1 TGTGCTTGTG AGTGTGGGAG AAGGCCTTCA AT AT G GAGAT ACCATATTAC AGCTTAAAAA 
61 TTGCAATTTC TTCATTTGCA ATTATCTTTG TACTAAGATG GGCATGGAAA ATCTTGAATT 
121 ATGTGTGGTT AAAACCAAAA GAATT GGAGA AATACCTCAG ACAGCAGGGT TTCAAAGGAA 
181 ACTCTTACAA ATTCTTGTTT GGGGATATGA AAGAGAT GAA GAAAAT GGGT GAAGAAGCTA 
241 TGTCTAAGCC AATCAATTTC T CT CAT GACA TGATTTGGCC TAGAGTTATG CCATTCATCC 
301 AC AAAAC CAT CACCAATTAT GGTAAGAATT GTATTGTGTG GTTT GGGCCA AGACCAGCAG 
361 TCCTGATCAC AGACCCGGAA CTT GTAAAGG AGGTGCTAAC GAAGAATTTC GTCTATCAGA 
421 AGCCGCTTGG CAATCCACTC ACAAAGTTGG CAGCAACTGG AATTGCAGGC TAT GAAACAG 
481 ATAAATGGGC TACACATAGA AGGCTTCTCA ATCCTGCTTT TCACCTTGAC AAGTT GAAGC 
541 ATATGCTACC TGCATTCCAA TTTACTGCTA GTGAGATGTT GAGCAAATTG GAGAAAGTTG 
601 TTTCACCAAA CGGAACAGAG ATAGATGTGT GGCCATATTT ACAAACTTTG ACAAGT GAT G 
661 CCATTTCAAG AACTGCGTTT GGAAGTAGTT AT GAAGAAGG AAGAAAGATT TTTGACCTTC 
721 AAAAAGAACA ACTTTCACTA ATTCTAGAAG TTTCACGCAC AATATATATT CCAGGAT GGA 
781 GGTTTTTGCC AACGAAAAGG AACAAAAGGA TGAAGCAAAT ATTTAATGAA GTACGAGCAC 
841 TGGTATTTGG AATTAT TAAG AAAAGGATGA GTATGATTGA AAATGGAGAA GCACCT GAT G 
901 ATTTATTGGG AATATT ATT G GCATCCAATT TAAAAGAAAT CCAACAACAT GGAAACAACA 
961 AGAAATTTGG TATGAGTATT GATGAGGTGA TTGAAGAGTG TAAACTCTTC TATTTTGCTG 
1021 GGCAAGAGAC TACTT CAT CT TTACTT GTAT GGACTAT GAT TTTGTTGTGC AAATAT CCTA 
1081 ATT GGCAAGA TAAAGCTAGA GAAGAGGTTT TGCAAGTGTT TGGGAGTAGG GAAGTT GACT 
1141 AT GACAAGTT GAAT CAGCTA AAAATAGTAA CTATGATCTT AAAC GAGGT C TTAAGGTTGT 
1201 ATCCAGCAGG ATATGTGATT AATCGAATGG TAAACAAAGA AACAAAGTTA GGGAATTTGT 
1261 GTTTACCAGC CGGCGTACAG CTCGTGTTAC CAACAAT GTT GTT GCAACAT GAT ACT GAAA 
1321 TATGGGGAGA T GAT GCAAT G GAGTTCAATC CAGAGAGATT TAGTGATGGA ATAT CCAAAG 
1381 CAACAAAAGG AAAACTTGTG TTTTTTCCAT TTAGTTGGGG TCCAAGAATA TGTATTGGGC 
1441 AAAATTTTGC TAT GT T AGAG GCTAAAATGG CAAT GGCT AT GATTCTGAAA ACCT AT GCAT 
1501 TTGAACTCTC TCCATCTTAT GCTCATGCTC CTCATCCACT ACTACTT CAA CCTCAATATG 
1561 GTGCTCAATT AATTTT GTAC AAGTT GTAGA TAT GGT CAAT TTGGAACTTG TTAT GGAACT 
1621 TTTATCATTG TAATCAACCA TATTGAGGGA ACATGGTTTG AGGTTAAATC CTCGTGTGTG 
1681 TGTC 

SEQ. ID. NO. 172 

1 MEIPYYSLKI AISSFAIIFV LRWAWKILNY VWLKPKELEK YLRQQGFKGN SYKFLFGDMK 
61 EMKKMGEEAM SKPINFSHDM IWPRVMPFIH KTITNYGKNC IVWFGPRPAV LITDPELVKE 
121 VLTKNFVYQK PLGNPLTKLA ATGIAGYETD KWATHRRLLN PAFHLDKLKH MLPAFQFTAS 
181 EMLSKLEKVV SPNGTEIDVW PYLQTLTSDA ISRTAFGSSY EEGRKIFDLQ KEQLSLILEV 
241 SRTIYIPGWR FLPTKRNKRM KQIFNEVRAL VFGIIKKRMS MIENGEAPDD LLGILLASNL 
301 KEIQQHGNNK KFGMSIDEVI EECKLFYFAG QETTSSLLVW TMILLCKYPN WQDKAREEVL 
361 QVFGSREVDY DKLNQLKIVT MILNEVLRLY PAGYVINRMV NKETKLGNLC LPAGVQLVLP 
421 TMLLQHDTEI WGDDAMEFNP ERFSDGISKA TKGKLVFFPF SWGPRICIGQ NFAMLEAKMA 
481 MAMILKTYAF ELSPSYAHAP HPLLLQPQYG AQLILYKL 
NAME D109-AH8 (14,1697) 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 173 

1 CCAGCACCAA GACAT GGAGA ATTCCTGGGT AGTTTTAGCC TTAACAGGCC TTCTTACATT 
61 AGTTTTTCTC TCAAAGTTTC TTCATAGTCC TCGTCGTAAA CAAAATCTTC CACCAGGTCC 
121 AAAACCATGG CCTATTGTTG GCAATATACA TCTTCTTGGT TCCACCCCTC ACAGATCCCT 
181 TCACGAACTT GCAAAAAGAT AC GGAGATTT AAT GCT ACTA AAGTTCGGTT CGCGCAATGT 
241 CCTTATTTTA TCCTCCCCAG AT AT GGCTAG AGAATTCTTG AAAACAAATG ATGCCATTTG 
301 GGCTTCTCGC CCTGAGCTTG CCGCTGGTAA ATATACT GCT TATAATTATT GCGACAT GAC 
361 ATGGGCACGT TAT GGACCCT TTTGGAGACA AGCAAGGAGG ATCTATCTCA ACGAGATTTT 
421 CAAT CCTAAA CGTTTGGATT CATTT GAGTA CATTCGCATA GAGGAAAGGC ATAATTTGAT 
481 TTCACGTCTT TTTGTTCTCT CTGGGAAGCC AATTCTTCTT AGAGACCATT TAACTCGGTA 
541 CACTCTTACA AGTATAAGTA GAACAGTATT GAGTGGAAAA TATTTTAGCG AGTCACCTGG 
601 CCAAAATTCA AT GATAACTT TGAAACAATT GCAGGATATG CTTGATAAGT GGTTTTTGCT 
661 TAATGGTGTG AT CAAT ATT G GGGACT GGAT ACCTTGGCTT GCTTTCTTGG ATTTGCAGGG 
721 TTATGTCAAG CAAAT GAAGG AGTT GCATAG GAACTT C GAC AAATTT CATA ACTTTGTGCT 
781 AGATGATCAC AAGGCTAATA GGGGAGAGAA GAACTTTGTG CCAAGAGACA TGGTCGATGT 
841 TTTGCTGCAG CAAGCTGAGG AT CCTAAT CT TGAGGTCAAA CT C AC CAAT G ATTGT GTCAA 
901 GGGTCTAATG CAGGACTTAT TGGCTGGCGG CACGGACACC TCAGCAACAA CCGTTGAATG 
961 GGCTTTTTAT GAACTT CTTA GACAACCTAA GAT TAT G AAG AAAGCACAAC AAGAGCTAGA 
1021 CCTTGTCATT TCACAGGACA GATGGGTTCA AGAAAAAGAT T AC ACT CAAC TCCCTTACAT 
1081 TGAGTCAATC AT CAAGGAAA CATTGAGGCT T CACCCAGTA AGCAC CAT GC TTCCACCGCG 
1141 CATTGCCTTG GAGGATTGTC AT GTAGCAGG CTAT GACATA CCTAAAGGTA CAATTTTAAT 
1201 TGTGAACACT TGGAGTATTG GAAGAAATTC ACAGCATTGG GAGTCACCAG AAGAATTCCT 
1261 TCCGGAGAGG TTTGAAGGGA AGAATATTGG TGTCACAGGA CAACATTTTG CGCTCTTGCC 
1321 ATTTGGCGCG GGCCGGAGAA AGTGCCCAGG ATACAGT CTT GGGATTCGTA TAATTAGGGC 
1381 AACTTTAGCT AACTTGTTGC AT GGAT T CAA CTGGAGATTG CCTAAT GGTA TGAGTCCAGA 
1441 AGACATTAGC ATGGAAGAGA TTTATGGGCT AATTACACAC CCCAAAGTCG CACTTGACGT 
1501 GAT GAT GGAG CCTCGACTTC CCAACCATCT TTACAAATAG TGGATAATTA AAACCATTAA 
1561 AATCGTTTTG TTATAT GCAT GT CT CAT ATT TGTAGTGGTC AAAATGTTTG TTTTCTATCA 
1621 TGGATGTTCA GTGCGAGGTT GGGAATTTCA AGT CATTAAC GT GT GAAAAT ATTTTAAATT 
1681 TAAAAAAAAA AAAAAAA 

SEQ. ID. NO. 174 

1 MENSWVVLAL TGLLTLVFLS KFLHSPRRKQ NLPPGPKPWP IVGNIHLLGS TPHRSLHELA 
61 KRYGDLMLLK FGSRNVLILS SPDMAREFLK TNDAIWASRP ELAAGKYTAY NYCDMTWARY 
121 GPFWRQARRI YLNEIFNPKR LDSFEYIRIE ERHNLISRLF VLSGKPILLR DHLTRYTLTS 
181 ISRTVLSGKY FSESPGQNSM ITLKQLQDML DKWFLLNGVI NIGDWIPWLA FLDLQGYVKQ 
241 MKELHRNFDK FHNFVLDDHK ANRGEKNFVP RDMVDVLLQQ AEDPNLEVKL TNDCVKGLMQ 
301 DLLAGGTDTS ATTVEWAFYE LLRQPKIMKK AQQELDLVIS QDRWVQEKDY TQLPYIESII 
361 KETLRLHPVS TMLPPRIALE DCHVAGYDIP KGTILIVNTW SIGRNSQHWE SPEEFLPERF 
421 EGKNIGVTGQ HFALLPFGAG RRKCPGYSLG IRIIRATLAN LLHGFNWRLP NGMSPEDISM 
481 EEIYGLITHP KVALDVMMEP RLPNHLYK 
NAME D110-AF12 (166,1631) 

ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 175 

1 ACT GTT CAAA TCACAGTAAC AGCATCTTGT GCT GCCATAA TAATT ACT CT AGTGGTGTGT 
61 ATATGGAGAG TGCTGAATTG GGTTTGGTTC AGACCAAAGA AGCTGGAAAA GCTACTGAGG 
121 AAACAAGGTC TCAAAGGCAA TTCCTACAGG ATTTTGTATG GGGATATGAA GGAGCTTTCT 
181 GGTAT GATTA AGGAAGCTAA CTCCAAACCC ATGAATCTTT CT GAT GAT AT TGCCCCAAGA 
241 TTGGTCCCTT TCTTTCTTGA TACCAT CAAG AAATAT GGGA AAAAAT CCTT TGTATGGTTG 
301 GGTCCAAAAC CGCTGGTTTT T GT CAT GGAC CCCGAGCTTA TAAAGGAAGT ATT CT C CAAA 
361 AACTATCTGT AT CAAAAGCC TCATTCAAAT CCATTAACCA AGTTACT GGC ACAAGGACTT 
421 GTAAGCCAAG AGGAAGACAA ATGGGCCAAA CATAGAAAAA TCGTCACTCC TGCCTTCCAC 
481 CT GGAGAAGC TAAAGCATAT GCTTCCAGCT TTTTGTTTGA GCTGTACTGA GATGCTGAGC 
541 AAAT GGGAAG ACATTGTTGC AGTTGAGGGC T CACAT GAGA T AGAT AT AT G GCCTGGCCTT 
601 CAACAATTAA CTAGTGATGT GATCTCTCGG ACAGCCTTTG GCAGTAGCTA TGAAGCAGGT 
661 AGAAGGATAT TTGAACTTCA AAAGGAACAA GCTCAATTTC TTAT GGAAGC TATACGCTCC 
721 GTTTATATTC CAGGCTGGAG GTTTTTGCCA ACAAAGAGGA ACAGAAGAAT GAAGGAAATT 
781 GAAAAGGATG TTCAAGCCTT AGTTAGAGGT ATTATTGATA AAAGAGTAAA GTCAATGAAA 
841 GCAGGAGAGG TGAATAATGA GGATCTGCTT GGTATATTGC TGGAATCTAA TTTTAAAGAA 
901 AT T GAACAGC AT G G AAACAA GGATTTTGGA AT GAGCATTG AAGAAGT CAT TCAAGAATGC 
961 AAGTTATT CT ATTTTGCTGG CCAAGAAACT AC AT CAGT GT TGCTTGTATG GACT CTAATA 
1021 TTGCTGAGCA GGCAT CAGGA TTGGCAAGCA CTGGCCAGAG AAGAGGTGTT GCAAGTCTTT 
1081 GGGAATCAGA AACCAGATTT T GAT GGATT A AATCGTCTAA AAATTGTTAC AATGATCTTG 
1141 TACGAGTCTT TAAGGCTCTA TCCCCCAGTA GT GACACTTA CCCGAAGGCC TAAGGAAGAC 
1201 ACT GTATTAG GAGAT GT AT C TCTACCAGCA GGTGTGTTAA TCTCCTTACC AGTGATCTTA 
1261 TTGCATCACG ACGAAGAGAT AT GGGGTAAA GAT GCAAAGA AGTT CAAGCC AGAGAGATTC 
1321 AGAGAT GGAG TCTCAAGTGC AACAAAGGGT CAAGT CACTT TTTTCCCATT TACTT GGGGT 
1381 CCCAGAATAT GCATT GGACA AAATTTTGCC AT GTTAGAAG CAAAGACTAC TTTGGCTATG 
1441 AT C CTACAAC GCTTCTCCTT TGAACTGTCT C CAT CTT AT G CACATGCTCC T CAGT C CAT A 
1501 ATAACTTTGC AAC CCCAGT A TGGTGCTCCA CTTATTTTGC ATAAAATATA GTTTATTACT 
1561 TGTAAGTAGT GTCTCGTTTT AT GTTAAGCA TGAGTCCAAA AT GTTAAGGC TTGTAGAACT 
1621 GCAAAAT GGG A 



SEQ. ID. NO. 176 

1 MKELSGMIKE ANSKPMNLSD DIAPRLVPFF LDTIKKYGKK SFVWLGPKPL VFVMDPELIK 
61 EVFSKNYLYQ KPHSNPLTKL IAQGLVSQEE DKWAKHRKIV TPAFHLEKLK HMLPAFCLSC 
121 TEMLSKWEDI VAVEGSHEID IWPGLQQLTS DVISRTAFGS SYEAGRRIFE LQKEQAQFLM 
181 EAIRSVYIPG WRFLPTKRNR RMKEIEKDVQ ALVRGIIDKR VKSMKAGEVN NEDLLGILLE 
241 SNFKEIEQHG NKDFGMSIEE VIQECKLFYF AGQETTSVLL VWTLILLSRH QDWQALAREE 
301 VLQVFGNQKP DFDGLNRLKI VTMILYESLR LYPPVVTLTR RPKEDTVLGD VSLPAGVLIS 
361 LPVILLHHDE EIWGKDAKKF KPERFRDGVS SATKGQVTFF PFTWGPRICI GQNFAMLEAK 
421 TTLAMILQRF SFELSPSYAH APQSIITLQP QYGAPLILHK I 
NAME D112-AA5 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 177 

1 ATTTATCTCT GAAAATGCAA TTCTTCAGCT TGGTTTCCAT TTTCCTATTC CTATCTTTCC 
61 TATTTTTGTT GAGGAAATGG AAGAACTCCA ATAGCCAAAG CAAAAAATTG CCACCAGGTC 
121 CAT GGAAAAT ACCAATACTA GGAAGTATGC T T CAT AT GAT TGGTGGAGAA CCGCACCATG 
181 T CCTTAGAGA TTTAGCCAAA AAAGAT GGAC CACTTAT GCA CCTTCAGTTA GGTGAAATTT 
241 CTGCAGTTGT GGTTACTTCT AGGGACATGG CAAAAGAAGT GCTAAAAACT CATGACGTCG 
301 TTTTTGCATC TAGGCCTAAA ATTGTAGCCA TGGACATTAT CT GTTATAAC CAGTCCGACA 
361 TTGCCTTTAG CCCTTATGGC GACCACTGGA GACAAATGCG TAAAATTTGT GT CAT GGAAC 
421 TTCTCAATGC AAAGAAT GTT CGGTCTTTCA GCTCCATCAG ACGTGATGAA GTCGTTCGTC 
481 TCATTGACTC TATCCGGTCA GATTCTTCTT CAGGTGAGCT AGTTAATTTT ACGCAGAGGA 
541 TCATTTGGTT TGCAAGCTCC ATGACGTGTA GAT C AG CAT T TGGGCAAGTA CTCAAGGGGC 
601 AAGACATATT TGCCAAAAAG AT C AGAGAAG TAATAGGATT AGCAGAAGGC TTTGATGTGG 
661 TAGACAT CTT CCCTACATAC AAGTTTCTTC AT GTTCT CAG TGGGATGAAG CGTAAACTTT 
721 TGAATGCCCA CCTTAAGGTA GACGCCATTG TTGAGGATGT CATCAACGAG CACAAGAAAA 
781 ATCTTGCAGC TGGCAAGAGT AAT GGCGCAT TAGGAGGCGA AGATCTAATT GATGTCCTAC 
841 TGAGACTTAT GAATGACACA AGTCTTCAAT TTCCCATCAC CAACGACAAT AT CAAAGCT G 
901 TTGTTGTTGA CATGTTTGCT GCCGGAACAG AAACTTCATC AACAACAACT GTATGGGCCA 
961 TGGCTGAAAT GAT GAAGAAT CCAAGTGTAT TCGCCAAAGC TCAAGCAGAA GT GCGAGAAG 
1021 CCTTTAGGGA CAAAGT AT CT TTTGATGAAA ATGATGTGGA GGAGCTGAAA TACTTAAAGT 
1081 TAGT CATTAA AGAAACTTTG AGACTT CAT C CACCGTCTCC ACTTTTGGTC CCAAGAGAAT 
1141 GCAGGGAAGA TACGGATATA AACGGCTACA CTATTCCTGC AAAGACCAAA GTTAT GGTTA 
1201 ATGTTTGGGC ATT GGGAAGA GAT CCAAAAT ATTGGGATGA CGCGGAAAGC TTTAAGCCAG 
1261 AGAGATTTGA GCAAT GTTCT GTAGATATTT TTGGTAATAA TTTTGAGTTT CTTCCCTTTG 
1321 GCGGGGGACG GAGAATTTGT CCTGGAATGT CATTT GGTTT AGCTAATCTT TACTTACCAT 
1381 TGGCTCAATT ACTCTATCAC TTTGACTGGA AACTCCCAAC CGGAAT CAAG CCAAGAGACT 
1441 TGGACTT GAC CGAATTATCG GGAATAACTA TTGCTAGAAA GGGTGACCTT TACTTAAATG 
1501 CTACTCCTTA TCAACCTTCT CGAGAGTAAT TTACTATTGG CATAAACATT TTAAATTTCC 
1561 T T CAT CAACC TC 

SEQ. ID. NO. 178 

1 MQFFSLVSIF LFLSFLFLLR KWKNSNSQSK KLPPGPWKIP ILGSMLHMIG GEPHHVLRDL 
61 AKKDGPLMHL QLGEISAVVV TSRDMAKEVL KTHDVVFASR PKIVAMDIIC YNQSDIAFSP 
121 YGDHWRQMRK ICVMELLNAK NVRSFSSIRR DEVVRLIDSI RSDSSSGELV NFTQRIIWFA 
181 SSMTCRSAFG QVLKGQDIFA KKIREVIGLA EGFDVVDIFP TYKFLHVLSG MKRKLLNAHL 
241 KVDAIVEDVI NEHKKNLAAG KSNGALGGED LIDVLLRLMN DTSLQFPITN DNIKAVVVDM 
301 FAAGTETSST TTVWAMAEMM KNPSVFAKAQ AEVREAFRDK VSFDENDVEE LKYLKLVIKE 
361 TLRLHPPSPL LVPRECREDT DINGYTIPAK TKVMVNVWAL GRDPKYWDDA ESFKPERFEQ 
421 CSVDIFGNNF EFLPFGGGRR ICPGMSFGLA NLYLPLAQLL YHFDWKLPTG IKPRDLDLTE 
481 LSGITIARKG DLYLNATPYQ PSRE 
NAME D120-AH4 

ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 179 

1 ATAATGCTTT CTCCCATAGA AGCCATTGTA GGACTAGTAA CCTTCACATT TCTCTTCTTC 
61 TTCCTATGGA CAAAAAAATC TCAAAAACCT TCAAAACCCT TACCACCGAA AATCCCCGGA 
121 GGATGGCCGG TAAT CGGCCA TCTTTTCCAC TTCAATGACG ACGGCGACGA CCGTCCATTA 
181 GCTCGAAAAC TCGGAGACTT AGCT GACAAA TACGGCCCCG TTTTCACTTT TCGGCTAGGC 
241 CTTCCCCTTG TCTTAGTTGT AAGCAGT TAC GAAGCT GTAA AAGACT GTTT CTCTACAAAT 
301 GACGCCATTT TTTCCAATCG TCCAGCTTTT CTTTACGGCG ATTACCTTGG CTACAATAAT 
361 GCCAT GCTAT TTTTGGCCAA TTACGGACCT TACTGGCGAA AAAATCGAAA ATTAGT TATT 
421 CAG GAAGT T C TCTCCGCTAG TCGTCTCGAA AAATT CAAAC ACGT GAGATT TGCAAGAATT 
481 CAAGCGAGCA TTAAGAATTT AT AT ACT CGA ATTGATGGAA ATT CGAGTAC GATAAATTTA 
541 ACTGATTGGT TAGAAGAATT GAATTTTGGT CTGATCGTGA AGATGATCGC TGGAAAAAAT 
601 TATGAATCCG GTAAAGGAGA TGAACAAGTG GAGAGATTTA AGAAAGCGTT TAAGGATTTT 
661 AT GATTTTAT CAATGGAGTT TGTGTTATGG GATGCATTTC CAATT CCATT ATTTAAATGG 
721 GTGGATTTTC AAGGGCATGT TAAGGCTATG AAAAGGACTT TTAAAGATAT AGATTCTGTT 
781 TTTCAGAATT GGTTAGGGGA ACATATTAAT AAAAGAGAAA AAATGGAGGT TAATGCAGAA 
841 GGGAAT GAAC AAGATTTCAT TGATGTGGTG CTTTCAAAAA TGAGTAATGA ATATCTTGGT 
901 GAAGGT TACT CTCGTGATAC TGTCATTAAA GCAACGGTGT TTAGTTTGGT CTTGGATGCA 
961 GCAGACACAG TTGCTCTTCA CATAAATTGG GGAAT GGCAT TATT GAT AAA CAATCAAAAG 
1021 GCCTTGACGA AAGCACAAGA AGAGATAGAC ACAAAAGTTG GTAAGGACAG ATGGGTAGAA 
1081 GAGAGT GAT A TTAAGGATTT GGTATACCTC CAAGCTATTG TTAAAGAAGT GTTACGATTA 
1141 TAT C CACCAG GACCTTTGTT AGT AC C AC AC GAAAAT GT AG AAGATTGTGT TGTTAGTGGA 
1201 TATCACATTC CTAAAGGGAC AAGATTATTC G CAAAC GT C A TGAAACTGCT ACGTGATCCT 
1261 AAACTCTGGC CTGATCCTGA TACTTTCGAT CCAGAGAGAT TCATTGCTAC TGATATTGAC 
1321 TTTCGTGGTC AGTACTATAA GTATATCCCG TTTGGTTCTG GAAGACGATC TTGTCCAGGG 
1381 AT GACT TAT G CATT GCAAGT GGAACACTTA ACAAT GGCAC ATTTGATCCA AGGTTT CAAT 
1441 TACAGAACT C CAAAT GACGA GCCCTTGGAT AT GAAGGAAG GT GCAGGCAT AACT AT AC GT 
1501 AAGGTAAATC CTGTGGAACT GATAATAGCG CCTCGCCTGG CACCTGAGCT TTATTAAAAC 
1561 CTAAGATCTT TCATCTTGGT T GAT CAT T GT ATAATACTCC T AAAT G GAT A TTCATTTACC 
1621 TTT TAT CAAT TAA 

SEQ. ID. NO. 180 

1 MLSPIEAIVG LVTFTFLFFF LWTKKSQKPS KPLPPKIPGG WPVIGHLFHF NDDGDDRPLA 
61 RKLGDLADKY GPVFTFRLGL PLVLVVSSYE AVKDCFSTND AIFSNRPAFL YGDYLGYNNA 
121 MLFLANYGPY WRKNRKLVIQ EVLSASRLEK FKHVRFARIQ ASIKNLYTRI DGNSSTINLT 
181 DWLEELNFGL IVKMIAGKNY ESGKGDEQVE RFKKAFKDFM ILSMEFVLWD AFPIPLFKWV 
241 DFQGHVKAMK RTFKDIDSVF QNWLGEHINK REKMEVNAEG NEQDFIDVVL SKMSNEYLGE 
301 GYSRDTVIKA TVFSLVLDAA DTVALHINWG MALLINNQKA LTKAQEEIDT KVGKDRWVEE 
361 SDIKDLVYLQ AIVKEVLRLY PPGPLLVPHE NVEDCVVSGY HIPKGTRLFA NVMKLLRDPK 
421 LWPDPDTFDP ERFIATDIDF RGQYYKYIPF GSGRRSCPGM TYALQVEHLT MAHLIQGFNY 
481 RTPNDEPLDM KEGAGITIRK VNPVELIIAP RLAPELY 



FIG. 91 



NAME D121-AA8 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 181 

1 AATCCATAAT GCTTTCTCCC ATAGAAGCCA TTGTAGGACT AGTAACCTTC ACATTTCTCT 
61 TCTTCTTCCT ATGGACAAAA AAATCTCAAA AACCTTCAAA ACCCTTACCA CCGAAAATCC 
121 CCGGAGGATG GCCGGTAATC GGCCATCTTT TCCACTTCAA TGACGACGGC GACGACCGTC 
181 CATTAGCTCG AAAACTCGGA GACTTAGCTG ACAAATACGG CCCCGTTTTC ACTTTTCGGC 
241 TAGGCCTTCC CCTTGTCTTA GTTGTAAGCA GTTACGAAGC TGTAAAAGAC TGTTTCTCTA 
301 CAAATGACGC CATTTTTTCC AATCGTCCAG CTTTTCTTTA CGGCGATTAC CTTGGCTACA 
361 ATAATGCCAT GCTATTTTTG GCCAATTACG GACCTTACTG GCGAAAAAAT CGAAAATTAG 
421 TTATTCAGGA AGTTCTCTCC GCTAGTCGTC TCGAAAAATT CAAACACGTG AGATTTGCAA 
481 GAATTCAAGC GAGCATTAAG AATTTATATA CTCGAATTGA TGGAAATTCG AGTACGATAA 
541 ATTTAACTGA TTGGTTAGAA GAATTGAATT TTGGTCTGAT CGTGAAGATG ATCGCTGGAA 
601 AAAATTATGA ATCCGGTAAA GGAGATGAAC AAGTGGAGAG ATTTAAGAAA GCGTTTAAGG 
661 ATTTTATGAT TTTATCAATG GAGTTTGTGT TATGGGATGC ATTTCCAATT CCATTATTTA 
721 AATGGGTGGA TTTTCAAGGG CATGTTAAGG CTATGAAAAG GACTTTTAAA GATATAGATT 
781 CTGTTTTTCA GAATTGGTTA GAGGAACATA TTAATAAAAG AGAAAAAATG GAGGTTAATG 
841 CAGAAGGGAA TGAACAAGAT TTCATTGATG TGGTGCTTTC AAAAATGAGT AATGAATATC 
901 TTGGTGAAGG TTACTCTCGT GATACTGTCA TTAAAGCAAC GGTGTTTAGT TTGGTCTTGG 
961 ATGCAGCAGA CACAGTTGCT CTTCACATAA ATTGGGGAAT GGCATTATTG ATAAACAATC 
1021 AAAAGGCCTT GACGAAAGCA CAAGAAGAGA TAGACACAAA AGTTGGTAAG GACAGATGGG 
1081 TAGAAGAGAG TGATATTAAG GATTTGGTAT ACCTCCAAGC TATTGTTAAA GAAGTGTTAC 
1141 GATTATATCC ACCAGGACCT TTGTTAGTAC CACACGAAAA TGTAGAAGAT TGTGTTGTTA 
1201 GTGGATATCA CATTCCTAAA GGGACAAGAT TATTCGCAAA CGTCATGAAA CTGCAACGTG 
1261 ATCCTAAACT CTGGTCTGAT CCTGATACTT TCGATCCAGA GAGATTCATT GCTACTGATA 
1321 TTGACTTTCG TGGTCAGTAC TATAAGTATA TCCCGTTTGG TTCTGGAAGA CGATCTTGTC 
1381 CAGGGATGAC TTATGCATTG CAAGTGGAAC ACTTAACAAT GGCACATTTG ATCCAAGGTT 
1441 TCAATTACAG AACTCCAAAT GACGAGCCCT TGGATATGAA GGAAGGTGCA GGCATAACTA 
1501 TACGTAAGGT AAATCCTGTG GAACTGATAA TAGCGCCTCG CCTGGCACCT GAGCTTTATT 
1561 AAAACCTAAG ATCATCTTGC TTGAT 



SEQ. ID. NO. 182 

1 MLSPIEAIVG LVTFTFLFFF LWTKKSQKPS KPLPPKIPGG WPVIGHLFHF NDDGDDRPLA 
61 RKLGDLADKY GPVFTFRLGL PLVLVVSSYE AVKDCFSTND AIFSNRPAFL YGDYLGYNNA 
121 MLFLANYGPY WRKNRKLVIQ EVLSASRLEK FKHVRFARIQ ASIKNLYTRI DGNSSTINLT 
181 DWLEELNFGL IVKMIAGKNY ESGKGDEQVE RFKKAFKDFM ILSMEFVLWD AFPIPLFKWV 
241 DFQGHVKAMK RTFKDIDSVF QNWLEEHINK REKMEVNAEG NEQDFIDVVL SKMSNEYLGE 
301 GYSRDTVIKA TVFSLVLDAA DTVALHINWG MALLINNQKA LTKAQEEIDT KVGKDRWVEE 
361 SDIKDLVYLQ AIVKEVLRLY PPGPLLVPHE NVEDCVVSGY HIPKGTRLFA NVMKLQRDPK 
421 LWSDPDTFDP ERFIATDIDF RGQYYKYIPF GSGRRSCPGM TYALQVEHLT MAHLIQGFNY 
481 RTPNDEPLDM KEGAGITIRK VNPVELIIAP RLAPELY 
NAME D122-AF10 
ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 183 

1 CTAAAACTCC ATAATGGTTT CTCCCGTAGA AGCCATTGTA GGACTAGTAA CCCTTACACT 
61 TCTCTTCTAC TTCCTATGGC CCAAAAAATT TCAAATACCT TCAAAACCAT TACCAC CGAA 
121 AATTCCCGGA GGGTGGCCGG TAATCGGCCA TCTTTTCTAC TTCGATGATG ACGGCGACGA 
181 CCGTCCATTA GCTCGAAAAC T C GGAGACT T AGCT GACAAA TACGGCCCGG TTTTCACTTT 
241 CCGGCTAGGC CTTCCGCTTG TGTTAATTGT AAGCAGTTAC GAAGCTGTAA AAGACTGCTT 
301 CTCTACAAAT GACGCCATTT TCTCCAATCG TCCAGCTTTT CTTTACGGTG AATACCTTGG 
361 CTACAATAAT GCCATGCTAT TTTT GACAAA ATACGGACCT TATTGGCGAA AAAATAGAAA 
421 AT T AGT CAT T CAGGAAGTTC TCTCTGCTAG TCGTCTCGAA AAATT GAAGC ACGT GAGATT 
481 TGGTAAAATT CAAACGAGCA TTAAGAGTTT ATACACTCGA ATTGATGGAA ATT CGAGTAC 
541 GATAAATCTA ACTGATTGGT TAGAAGAATT GAATTTTGGT CTGATCGTGA AAATGATCGC 
601 TGGGAAAAAT TATGAATCCG GTAAAGGAGA TGAACAAGTG GAGAGATTTA GGAAAGCGTA 
661 TAAGGATTTT ATAATTTTAT CAAT GGAGTT TGTGTTATGG GATGCTTTTC CAATTCCATT 
721 GTTCAAATGG GTGGATTTTC AAGGCTATGT TAAGGCCATG AAAAGGACAT TTAAGGATAT 
781 AGATTCTGTT TTTCAGAATT GGTTAGAGGA ACATGTCAAG AAAAGAGAAA AAAT GGAGGT 
841 TAAT GCACAA GGGAAT GAAC AAGATTT CAT TGATGTGGTG CTTTCAAAAA TGAGTAATGA 
901 ATATCTTGAT GAAGGTTACT CTCGTGATAC TGTCATAAAA GCAACAGTGT TTAGTTTGGT 
961 CTTGGATGCT GCGGACACAG TTGCTCTTCA CAT GAATT GG GGAAT GGCAT TACT GATAAA 
1021 CAATCAACAT GCCTTGAAGA AAGCACAAGA AGAGAT CGAT AAGAAAGTTG GTAAGGAAAG 
1081 ATGGGTAGAA GAGAGT GAT A TTAAGGATTT GGTCTACCTC CAAGCTATTG TTAAAGAAGT 
1141 GTTACGATTA TATCCACCAG GACCTTTATT AGTACCT CAT GAAAAT GT AG AGGATTGTGT 
1201 TGTTAGTGGA TAT C AC AT T C CTAAAGGGAC TAGACTATTC GCGAACGTTA TGAAATTGCA 
1261 GCGCGATCCT AAACTCTGGT CAAATCCTGA TAAGTTTGAT CCAGAGAGAT TCTTCGCTGA 
1321 TGATATTGAC TACCGTGGTC AGCACTATGA GTTTAT CCCA TTTGGTTCTG GAAGACGATC 
1381 TTGTCCGGGG AT GACT TAT G CATTACAAGT GGAACACCTA ACAATAGCAC ATTTGATCCA 
1441 GGGTTTCAAT TACAAAACTC CAAAT GACGA GCCCTTGGAT AT GAAGGAAG GTGCAGGATT 
1501 AACTATACGT AAAGT AAAT C CTGTAGAAGT GACAAT TACG GCTCGCCTGG CACCTGAGCT 
1561 TTATTAAAAC CTTAGATGTT TTATCTTGAT T GT ACTAATA TAT AT AT GCA GAAAAAATTG 

SEQ. ID. NO. 184 

1 MVSPVEAIVG LVTLTLLFYF LWPKKFQIPS KPLPPKIPGG WPVIGHLFYF DDDGDDRPLA 
61 RKLGDLADKY GPVFTFRLGL PLVLIVSSYE AVKDCFSTND AIFSNRPAFL YGEYLGYNNA 
121 MLFLTKYGPY WRKNRKLVIQ EVLSASRLEK LKHVRFGKIQ TSIKSLYTRI DGNSSTINLT 
181 DWLEELNFGL IVKMIAGKNY ESGKGDEQVE RFRKAYKDFI ILSMEFVLWD AFPIPLFKWV 
241 DFQGYVKAMK RTFKDIDSVF QNWLEEHVKK REKMEVNAQG NEQDFIDVVL SKMSNEYLDE 
301 GYSRDTVIKA TVFSLVLDAA DTVALHMNWG MALLINNQHA LKKAQEEIDK KVGKERWVEE 
361 SDIKDLVYLQ AIVKEVLRLY PPGPLLVPHE NVEDCVVSGY HIPKGTRLFA NVMKLQRDPK 
421 LWSNPDKFDP ERFFADDIDY RGQHYEFIPF GSGRRSCPGM TYALQVEHLT IAHLIQGFNY 
481 KTPNDEPLDM KEGAGLTIRK VNPVEVTITA RLAPELY 
NAME D128-AB7 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 185 

1 CGAGGCTCCC CACCAAAAAA TCATTTCTCT CGTCTAAAAT GGATCTTCTC TTACTAGAGA 
61 AGACCTTAAT TGGTCTTTTC TTTGCCATTT TAATCGCTTT AATTGTCTCT AAACTTCGTT 
121 CAAAGCGTTT TAAGCTTCCT CCAGGACCAA TTCCAGTACC AGTTTTTGGT AATTGGCTTC 
181 AAGTTGGTGA TGATTTAAAC CACAGAAATC TTACTGATTA TGCCAAAAAA TTTGGCGATC 
241 TTTTCTTGTT AAGAATGGGT CAACGTAACT TAGTTGTTGT GTCATCTCCT GAATTAGCTA 
301 AAGAAGTTTT ACACACACAA GGTGTTGAAT TTGGTTCAAG AACAAGAAAT GTTGTGTTTG 
361 ATATTTTTAC TGGAAAAGGT CAAGATATGG TTTTTACTGT ATATGGTGAA CATTGGAGAA 
421 AAATGAGGAG AATTATGACT GTACCATTTT TTACTAATAA AGTTGTGCAA CAGTATAGAG 
481 GGGGGTGGGA GTTTGAGGTG GCAAGTGTAA TTGAGGATGT GAAAAAAAAT CCTGAATCTG 
541 CTACTAATGG GATCGTATTA AGGAGGAGAT TACAATTAAT GATGTATAAT AATATGTTTA 
601 GGATTATGTT TGATAGGAGA TTTGAGAGTG AAGATGATCC TTTGTTTGTT AAGCTTAAGG 
661 CTTTGAATGG TGAAAGGAGT AGATTGGCTC AAAGTTTTGA GTATAATTAT GGTGATTTTA 
721 TTCCAATTTT GAGGCCTTTT TTGAGAGGTT ATTTGAAGAT CTGTAAAGAA GTTAAGGAGA 
781 AGAGGCTGCA GCTTTTCAAA GATTACTTTG TTGATGAAAG AAAGAAGCTT TCAAATACCA 
841 AGAGCTCGGA CAGCAATGCC CTAAAATGTG CGATTGATCA CATTCTTGAG GCTCAACAGA 
901 AGGGAGAGAT CAATGAGGAC AACGTTCTTT ACATTGTTGA AAACATCAAT GTTGCTGCAA 
961 TTGAAACAAC ATTATGGTCA ATTGAGTGGG GTATCGCCGA GCTAGTCAAC CACCCTCACA 
1021 TCCAAAAGAA ACTGCGCGAC GAGATTGACA CAGTTCTTGG ACCAGGAGTG CAAGTGACTG 
1081 AACCAGACAC CCACAAGCTT CCATACCTTC AGGCTGTGAT CAAGGAGGCA CTTCGTCTCC 
1141 GTATGGCAAT TCCTCTATTA GTCCCACACA TGAACCTTCA CGACGCAAAG CTTGGCGGGT 
1201 TTGATATTCC AGCAGAGAGC AAAATCTTGG TTAACGCTTG GTGGTTAGCT AACAACCCGG 
1261 CTCATTGGAA GAAACCCGAA GAGTTCAGAC CCGAGAGGTT CTTTGAAGAG GAGAAGCATG 
1321 TTGAGGCCAA TGGCAATGAG TTCAGATATC TTCCGTTTGG CGTTGGTAGG AGGAGCTGCC 
1381 CTGGAATTAT ACTTGCATTG CCAATTCTTG GCATCACTTT GGGACGTTTG GTTCAGAACT 
1441 TTGAGCTGTT GCCTCCTCCA GGCCAGTCGA AGCTCGACAC CACAGAGAAA GGTGGACAGT 
1501 TCAGTCTCCA CATTTTGAAG CATTCCACCA TTGTGTTGAA ACCAAGGTCT TTCTGAACTT 
1561 TGTGATCTTA TTAATTAAGG GGTTCTGAAG AAATTTGATA GTGTTGGATA TTAAGGGCGA 
1621 ATT 

SEQ. ID. NO. 186 

1 MDLLLLEKTL IGLFFAILIA LIVSKLRSKR FKLPPGPIPV PVFGNWLQVG DDLNHRNLTD 
61 YAKKFGDLFL LRMGQRNLVV VSSPELAKEV LHTQGVEFGS RTRNVVFDIF TGKGQDMVFT 
121 VYGEHWRKMR RIMTVPFFTN KVVQQYRGGW EFEVASVIED VKKNPESATN GIVLRRRLQL 
181 MMYNNMFRIM FDRRFESEDD PLFVKLKALN GERSRLAQSF EYNYGDFIPI LRPFLRGYLK 
241 ICKEVKEKRL QLFKDYFVDE RKKLSNTKSS DSNALKCAID HILEAQQKGE INEDNVLYIV 
301 ENINVAAIET TLWSIEWGIA ELVNHPHIQK KLRDEIDTVL GPGVQVTEPD THKLPYLQAV 
361 IKEALRLRMA IPLLVPHMNL HDAKLGGFDI PAESKILVNA WWLANNPAHW KKPEEFRPER 
421 FFEEEKHVEA NGNDFRYLPF GVGRRSCPGI IIALPILGIT LGRLVQNFEL LPPPGQSKLD 
481 TTEKGGQFSL HILKHSTIVL KPRSF 
NAME D129-AD10 
ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 187 

1 CAACACGCTT ACTATCTCCT AAATCTCCAC TCAAAAACAA AGAAGAGAAA GATTTAAAAC 
61 TAATAATTAT GAAAGAGATG GTGCAAAACA ATATGAGCAC TTCTCTTCTT GAAACTTTAC 
121 AAGCTACGCC CATGATATTC TACTTCATCG TCCCTCTCTT CTGCTTATTC CTTCTCTCCA 
181 AATCTCGCCG TAAACGTTTG CCTCCAGGTC CAACTGGCTG GCCTCTCATT GGTAACATGA 
241 TGATGATGGA CCAGTTAACT CACCGTGGCC TTGCCAAACT AGCCCAAAAA TATGGTGGTG 
301 TTTTTCACCT TAAAATGGGT TATGTTCACA AAATTGTAGT CTCTGGTCCA GACGAAGCTC 
361 GCCAAGTATT ACAGGAACAC GACATCATAT TTTCGAACCG TCCAGCGACC GTAGCCATAA 
421 GTTACCTAAC ATATGACAGG GCAGACATGG CTTTTGCTGA CTATGGACTC TTCTGGCGGC 
481 AGATGAGAAA ACTATGTGTA ATGAAACTCT TCAGCCGCAA ACGAGCTGAG TCATGGGACT 
541 CAGTTCGAGA CGAAGCGGAT TCCATGGTTA GAATTGTAAC AACCAACACA GGCACAGCTG 
601 TTAACTTAGG TGAACTTGTT TTCAGTCTCA CTCGTAATAT TATCTACAGA GCTGCTTTTG 
661 GAACTTGTTC TGAAGATGGA CAAGGCGAGT TCATTGAAAT TATGCAAGAG TTTTCGAAGC 
721 TATTTGGCGC TTTCAATATA GCTGATTTTA TTCCATGGCT AGGGTGGGTT GGTAAGCAGA 
781 GTCTAAATAT TAGACTTGCT AAGGCTAGAG CGTCGCTTGA TGGGTTCATT GATTCGATTA 
841 TTGATGACCA TATTATTAGA AAGAAAGCTT ATGTTAATGG CAAAAATGAT GGAGGTGATC 
901 GAGAAACTGA TATGGTGGAT GAGCTTTTAG CTTTTTACAG TGAGGAAGCA AAAGTAACTG 
961 AGTCCGAAGA TTTGCAGAAT GCTATCAGAC TTACTAAGGA TAGTATCAAA GCTATCATCA 
1021 TGGATGTAAT GTTTGGAGGG ACAGAAACAG TGGCTTCTGC AATAGAATGG GCCATGGCAG 
1081 AGCTTATGAG GAGTCCTGAA GATCTTAAAA AAGTACAACA AGGGCTGGCT AACGTTGTTG 
1141 GACTCAACAG AAAAGTTGAA GAATCTGACT TTGAAAAATT AACATACTTA AGATGTTGTC 
1201 TAAAAGAAAC TCTACGACTT CACCCTCCAA TCCCTCTCCT CCTCCATGAG ACCGCCGAGG 
1261 AATCCACCGT CTCCGGCTAC CATATTCCGG CAAAGTCACA TGTTATTATA AATTCATTTG 
1321 CCATTGGGCG TGACAAAAAT TCATGGGAAG ATCCTGAAAC TTATAAACCA TCTAGGTTTC 
1381 TCAAAGAAGG TGTACCAGAT TTTAAAGGAG GTAATTTTGA GTTTATACCA TTTGGGTCGG 
1441 GTCGGCGGTC TTGCCCCGGT ATGCAACTTG GGCTTTATGC ATTGGAAATG GCTGTGGCCC 
1501 ATCTTCTTCA TTGTTTTACT TGGGAATTGC CAGATGGTAT GAAACCAAGT GAGCTTAAAA 
1561 TGGATGATAT TTTTGGACTC ACTGCTCCAA GAGCTAATCG ACTCGTGGCT GTGCCTACTC 
1621 CACGCTTGTT GTGTCCCCTT TATTAATTGA AGAAAAAAGG TGGGGCT 

SEQ. ID. NO. 188 

1 MKEMVQNNMS TSLLETLQAT PMIFYFIVPL FCLFLLSKSR RKRLPPGPTG WPLIGNMMMM 
61 DQLTHRGIAK LAQKYGGVFH LKMGYVHKIV VSGPDEARQV LQEHDIIFSN RPATVAISYL 
121 TYDRADMAFA DYGLFWRQMR KLCVMKLFSR KRAESWDSVR DEADSMVRIV TTNTGTAVNL 
181 GELVFSLTRN IIYRAAFGTC SEDGQGEFIE IMQEFSKLFG AFNIADFIPW LGWVGKQSLN 
241 IRLAKARASL DGFIDSIIDD HIIRKKAYVN GKNDGGDRET DMVDELLAFY SEEAKVTESE 
301 DLQNAIRLTK DSIKAIIMDV MFGGTETVAS AIEWAMAELM RSPEDLKKVQ QGLANVVGLN
361 RKVEESDFEK LTYLRCCLKE TLRLHPPIPL LLHETAEEST VSGYHIPAKS HVIINSFAIG
421 RDKNSWEDPE TYKPSRFLKE GVPDFKGGNF EFIPFGSGRR SCPGMQLGLY ALEMAVAHLL
481 HCFTWELPDG MKPSELKMDD IFGLTAPRAN RLVAVPTPRL LCPLY
NAME D135-AE1
ORGANISM NICOTIANA TABACUM

SEQ. ID. NO. 189

1 GGGGGATAAG AATATGGAGA TACCATATTA CAGCTTAAAA CTTACAATTT TTTCATTTGC
61 AATTATCTTT GTACTAAGAT GGGCATGGAA AATCTTGAAT TATGTGTGGT TAAAACCAAA
121 AGAATTGGAG AAATGCATCA GACAGCAGGG TTTCAAAGGA AACTCTTACA AATTCTTGTT
181 TGGGGATATG AAAGAGATAA AGAAAATGGG TGAAGAAGCT ATGTCTAAGC CAATCAATTT
241 CTCTCATGAC ATGATTTGGC CTAGAGTCAT GCCCTTCATC CACAAAACCA TCACCAATTA
301 TGGTAAGAAT TGTTTTGTGT GGTTTGGGCC AAGACCAGCA GTCCTGATCA CAGACCCGGA
361 ACTTGTAAAG GAGGTGCTAA CGAAGAATTT CGTTTATCAG AAGCCACCTG GCACTCCACT
421 CACAAAATTG GCAGCAACTG GAATTGCAGG CTATGAAACA GATAAATGGG CTACACATAG
481 AAGGCTTCTC AATCCTGCTT TTCACCTTGA CAAGTTGAAG CATATGCTAC CTGCATTCCA
541 ATTTACTGCT TGTGAGATGT TGAGCAAATT GGAGAAAGTT GTCTCACCAA ATGGAACAGA
601 GATAGATGTG TGGCCATATC TACAAACTTT AACAAGTGAT GCCATTTCAA GAACTGCTTT
661 TGGCAGTAGT TATGAAGAAG GAAGAAAGCT TTTTGAACTT CAAAAGGAAC AACTTTCACT
721 AATTCTAGAA GTGTCCCGCA CAATATACAT CCCAGGATGG AGGTTTTTGC CAACAAAAAG
781 GAACAAAAGG ATGAAGCAAA TATTTAATGA AGTACGAGCG CTGGTATTGG GAATTATTAA
841 GAAAAGATTG AGTATGATTG AAAATGGAGA AGCTCCTGAT GATTTATTGG GTATATTATT
901 GGCATCCAAT TTAAAAGAAA TCCAACAACA TGGAAATAAC AAGAAATTTG GTATGAGTAT
961 TGATGAGGTG ATTGAAGAGT GTAAACTCTT CTATTTTGCG GGGCAAGAGA CAACTTCATC
1021 TTTACTTGTA TGGACTATGA TTTTGTTGTG CAAACATCCT AGTTGGCAAG ATAAAGCTAG
1081 AGAAGAGGTT TTGCAAGTGT TTGGAAGTAG GGAAGTTGAC TATGACAAGT TGAATCAGCT
1141 AAAAATAGTA ACTATGATCT TAAACGAGGT CTTAAGGTTG TATCCAGCAG GATATGCGAT
1201 TAATCGAATG GTAACCAAAG AAACAAAGTT AGGGAATTTA TGTTTACCAG CTGGGGTACA
1261 ACTCTTGTTA CCAACAATTT TGTTGCAACA TGATACTGAA ATATGGGGAG ATGATGCAAT
1321 GGAGTTCAAT CCAGAGAGAT TTAGTGATGG AATATCCAAA GCAACAAAAG GAAAACTTGT
1381 GTTCTTTCCA TTTAGTTGGG GTCCAAGAAT ATGTATTGGG CAAAATTTTG CTATGTTAGA
1441 GGCCAAGATG GCAATGGCTA TGATTCTGAA AAACTATGCA TTTGAACTCT CTCCATCTTA
1501 TGCTCATGCT CCTCATCCAC TACTACTTCA ACCTCAATAT GGTGCTCAAT TAATTTTGTA
1561 CAAGTTGTAG AAATGGTCAA TTTGGAACTT GTTATGGAAC TTTTATCATC GTAATCAACC

SEQ. ID. NO. 190

1 MEIPYYSLKL TIFSFAIIFV LRWAWKILNY VWLKPKELEK CIRQQGFKGN SYKFLFGDMK
61 EIKKMGEEAM SKPINFSHDM IWPRVMPFIH KTITNYGKNC FVWFGPRPAV LITDPELVKE
121 VLTKNFVYQK PPGTPLTKLA ATGIAGYETD KWATHRRLLN PAFHLDKLKH MLPAFQFTAC
181 EMLSKLEKVV SPNGTEIDVW PYLQTLTSDA ISRTAFGSSY EEGRKLFELQ KEQLSLILEV
241 SRTIYIPGWR FLPTKRNKRM KQIFNEVRAL VLGIIKKRLS MIENGEAPDD LLGILLASNL
301 KEIQQHGNNK KFGMSIDEVI EECKLFYFAG QETTSSLLVW TMILLCKHPS WQDKAREEVL
361 QVFGSREVDY DKLNQLKIVT MILNEVLRLY PAGYAINRMV TKETKLGNLC LPAGVQLLLP
421 TILLQHDTEI WGDDAMEFNP ERFSDGISKA TKGKLVFFPF SWGPRICIGQ NFAMLEAKMA
481 MAMILKNYAF ELSPSYAHAP HPLLLQPQYG AQLILYKL
NAME D141-AD7
ORGANISM NICOTIANA TABACUM

SEQ. ID. NO. 191 AAAATGGAGA TTCAGTTTTC TAACTTAGTT GCATTCTTGC TCTTTCTCTC 

61 SgStcSt CTTCTATTCA AAAAATGGAA AACCAGAAAA CTAAATTTGC CTCCTGGTCC
121 Sggaaatta CCTTTTATTG GAAGTTTACA CCATTTGGCT GTGGCAGGTC CACTTCCTCA
181 SSt^gccta AAAAATTTAG CCAAACGCTA TGGTCCTCTT ATGCATTTAC AACTTGGACA
241 SStccSSa c^catcatat CATCACCTCA AATGGCAAAA GAAGTACTAA AAACTCACGA
301 c^?SgStS SSSagac CAAAGCTTGT CGTGGCCGAC ATCATTCACT ACGACAGCAC
361 gXcatagca TTTTCTCCGT ACGGTGAATA CTGGAGACAA ATTCGTAAAA TTTGCATATT
421 gSS?ctS aotgccaaga TGGTCAAATT TTTTAGCTCG ATTCGCCAAG ATGAGCTCTC
481 ga?gatgctc tXtctatac GAACGACACC CAATCTTACA GTCAATCTTA CTGACAAAAT
541 S^SSS ISagttcgg TAACTTGTAG ATCAGCTTTA GGGAAGATAT GTGGTGACCA
601 AGACAAATTG ATCATTTTTA TGAGGGAAAT AATATCATTG GCAGGTGGAT TTAGTATTGC
661 £SS5!Sc CCTACATGGA AAATGATTCA TGATATTGAT GGTTCGAAAT CTAAACTGGT
721 GAAAGCACAT CGTAAGATTG ATGAAATTTT GGGAAATGTT GTTGATGAGC ACAAAAAGAA
781 CAG^GCAGAT GGCAAGAAGG GTAATGGTGA ATTTGGTGGT GAAGATTTGA TTGATGTATT
841 cttaaSctt AGAGAAAGTG GAGAAGTTCA AATTCCTATC ACAAATGACA ATATCAAATC
901 AAT^TTAATC GACATGTTCT CTGCGGGATC TGAAACATCA TCGACGACTA TAATTTGGGC
961 mtagctgaa ATGATGAAGA AACCAAGTGT TTTAGCAAAG GCACAAGCTG AAGTAAGGCA
1021 agS?tgaag GAGAAAAAAG GTTTTCAACA GATTGATCTT GATGAGCTAA AATATCTCAA
1081 GTTAGTAATC AAAGAAACCT TAAGAATGCA CCCTCCAATT CCTCTATTAG TTCCTAGAGA
1141 Sgtatggag GATACAAAGA TTGATGGTTA CAATATACCT TTCAAAACAA GAGTCATAGT
1201 JIatgStgg GCAATCGGAC GAGATCCAGA AAGTTGGGAT GACCCCGAAA GCTTTATGCC
1261 AGAGAGATTT GAGAATAGTT CTATTGACTT TCTTGGAAAT CATGATCAGT TTATACCATT
1321 JgSScaggI AGAAGGATTT GTCCGGGAAT GCTATTTGGT TTAGCTAATG TTGGACAACC
1381 tSgScag" t^?ttItc ACTTCGATTG GAAACTCCCT AATGGACAAA GTCATGAGAA
1441 TTTCGACATG ACTGAGTCAC CTGGAATTTC TGCTACAAGA AAGGATGATC TTGTTTTGAT
1501 SccSSS TATGATTCTT ATTAAGCAGT AGCAGAAATA AAAAGCCGGG GCAAACAGAA
1561 AAAAGT

SEQ. ID. NO. 192 FLLFLSSIFL LFKKWKTRKL NLPPGPWKLP FIGSLHHIAV AGPLPHHGLK

61 NLAKFLYGPLM HLQLGQIPTL IISSPQMAKE VLKTHDIAFA TRPKLVVADI IHYDSTDIAF
121 SP^EWRQI RKICILELLS AKMVKFFSSI RQDELSKMLS SIRTTPNLTV NLTDKIFWFT
181 ss^cSalg KICGDQDKLI IFMREIISLA GGFSIADFFP TWKMIHDIDG SKSKLVKAHR
241 kiSSgnw dehkkSradg KKGNGEFGGE DLIDVLLRVR ESGEVQIPIT NDNIKSILID
301 MFSAGSETSS TTIIWALAEM MKKPSVLAKA QAEVRQALKE KKGFQQIDLD ELKYLKLVIK
361 SlrmSpip LLVPRECMED TKIDGYNIPF KTRVIVNAWA IGRDPESWDD PESFMPERFE
421 nssSflgnh HQFIPFGAGR RICPGMLFGL ANVGQPIAQL LYHFDWKLPN GQSHENFDMT
481 ESPGISATRK DDLVLIATPY DSY 
NAME D147-AD3
ORGANISM NICOTIANA TABACUM

SEQ. ID. NO. 193 ACACATTGAG TCCTCTCCCA AATCACTGAT TCACCACCAA AAGTACCAAC

61 AATTCAATGG AAGGTACAAA CTTGACTACA TATGCAGCAG TATTTCTTGA TACTCTGTTT 
121 CTTTTGTTCC TTTCCAAACT TCTTCGCCAG AGGAAACTCA ATTTACCTCC AGGCCCAAAA 
181 CCATGGCCGA TCATCGGAAA CTTAAACCTT ATTGGCAATC TTCCTCATCG CTCAATCCAC 
241 GAACTCTCCC T CAAGTACGG ACCCGTTATG CAACT CCAAT TCGGGTCTTT CCCCGTTGTA 
301 GTTGGATCCT CCGTCGAAAT GGCTAAGATT TTCCTCAAAT CCATGGATAT TAACTTTGTA 
361 GGCAGGCCTA AAACGGCTGC CGGAAAATAC ACAACGTACA ATTATTCCGA TATTACATGG 
421 TCTCCTTACG GACCATATTG GCGCCAGGCA CGTAGGATGT GCCTAACGGA ATTATTCAGC 
481 ACGAAACGTC TCGATTCATA CGAGTATATT C GGGCT GAGG AGTTGCATTC TCTTCTCCAT 
541 AATTT GAACA AAATATCAGG GAAACCAATT GTGTTGAAAG ATTATTCGAC GACGTTGAGT 
601 TTAAATGTTA TTAGCAGGAT GGTACTGGGG AAAAGGTATT TGGACGAATC CGAGAACTCG 
661 TTCGTGAATC CTGAGGAATT TAAGAAGAT G TT GGACGAAT TGTTTTTGCT AAATGGTGTA 
721 CTTAATATTG GAGATT CAAT TCCATGGATT GATTT CATGG ATTTGCAAGG TTATGTTAAG 
781 AGGATGAAAG TAGTGAGCAA GAAATTCGAC AAGTTTTTAG AGCATGTTAT TGATGAGCAT
841 AACATTAGGA GAAATGGAGT GGAGAATTAT GTTGCTAAGG ATATGGTGGA TGTTTTGTTG 
901 CAGCTCGCTG AT GATCCGAA GTT GGAAGTT AAGCT GGAGA GACATGGAGT CAAAGCATTC 
961 ACTCAGGATA TGCTGGCTGG TGGAACCGAG AGTTCAGCAG TGACAGTGGA GTGGGCAATT
1021 TCAGAGCTGC TAAAGAAGCC GGAGATTTTC AAAAAGGCTA CAGAAGAATT GGATCGAGTA 
1081 ATTGGGCAGA ATAGATGGGT ACAAGAAAAG GACATTCCAA ATCTTCCTTA CATAGAGGCA 
1141 ATAGTCAAAG AGACTATGCG ACTGCACCCC GTGGCACCAA TGTTGGTGCC ACGTGAGTGT 
1201 CGAGAAGATA TTAAGGTAGC AGGCTACGAC GTT CAGAAAG GAACTAGGGT TCTCGTGAGT 
1261 GTATGGACTA TTGGAAGAGA CCCTACATT G TGGGACGAGC CTGAGGTGTT CAAGCCGGAG 
1321 AGATTCCATG AAAGGTCCAT AGAT GTTAAA GGACATGATT ATGAGCTTTT GCCATTTGGA 
1381 GCGGGGAGAA GAATGTGCCC GGGTTATAGC TTGGGGCTCA AGGTGATTCA AGCTAGCTTA 
1441 GCTAATCTTC TACATGGATT TAACTGGTCA TTGCCTGATA ATATGACTCC TGAGGACCTC 
1501 AACATGGATG AGATTTTTGG GCTCTCTACA CCTAAAAAAT TTCCACTTGC TACTGTGATT 
1561 GAGCCAAGAC TTTCACCAAA ACTTTACTCT GTTTGATTCA GCAGTTCTAT GGTTCCGTCA 
1621 AGATAGACTT TGTTACGTTT GAACCTGTGC TC 

SEQ. ID. NO. 194 AVFLDTLFLL FLSKLLRQRK LNLPPGPKPW PIIGNLNLIG NLPHRSIHEL

61 SLKYGPVMQL QFGSFPVVVG SSVEMAKIFL KSMDINFVGR PKTAAGKYTT YNYSDITWSP
121 YGPYWRQARR MCLTELFSTK RLDSYEYIRA EELHSLLHNL NKISGKPIVL KDYSTTLSLN
181 VISRMVLGKR YLDESENSFV NPEEFKKMLD ELFLLNGVLN IGDSIPWIDF MDLQGYVKRM
241 KVVSKKFDKF LEHVIDEHNI RRNGVENYVA KDMVDVLLQL ADDPKLEVKL ERHGVKAFTQ
301 DMLAGGTESS AVTVEWAISE LLKKPEIFKK ATEELDRVIG QNRWVQEKDI PNLPYIEAIV
361 KETMRLHPVA PMLVPRECRE DIKVAGYDVQ KGTRVLVSVW TIGRDPTLWD EPEVFKPERF
421 HERSIDVKGH DYELLPFGAG RRMCPGYSLG LKVIQASLAN LLHGFNWSLP DNMTPEDLNM
481 DEIFGLSTPK KFPLATVIEP RLSPKLYSV 
NAME D163-AF12
ORGANISM NICOTIANA TABACUM

SEQ. ID. NO. 195 

1 CTTCTTCCTT CCTAACTAAA AATGGAGATT CAGTTTTCTA ACTTAGTTGC ATTCTTGCTC 
61 TTTCTCTCCA GCATCTTTCT TGTATTCAAA AAATGGAAAA CCAGAAAACT AAATTTGCCT 
121 CCTGGTCCAT GGAAAT TACC TTTTATTGGA AGTTTACACC ATTTGGCTGT GGCAGGTCCA 
181 CTTCCTCACC ATGGCCTAAA AAATTTAGCC AAACGCTATG GTCCTCTTAT GCATTTACAA 
241 CTTGGACAAA TTCCTACACT C GT CAT AT CA TCACCTCAAA TGGCAAAAGA AGTACTAAAA 
301 ACTCACGACC TCGCTTTTGC CACTAGACCA AAGCTTGTCG TGGCCGACAT CATTCACTAC 
361 GACAGCACGG ACATAGCATT TTCGCCATAC GGT GAATACT GGAGACAAAT TCGTAAAATT 
421 TGCATATTGG AACTCTTGAG T GCCAAGAT G GTCAAGTTTT TTAGCT C GAT TCGCCAAGAT 
481 GAGCTCTCGA AGATGGTTTC AT CTATACGA ACGACGCCCA ATCTTCCAGT CAAT CTTACC 
541 GACAAGATTT TTTGGTTTAC GAGTTCGGTA ATTT GTAGAT CAGCTTTAGG GAAGAT AT GT 
601 GGT GACCAAG ACAAATTGAT CATTTTTATG AGGGAAATAA TATCATTGGC AGGTGGATTT 
661 AGTATTGCTG ATTTTTTCCC TACAT GGAAA AT GATT CAT G ATATTGATGG TTCAAAATCT 
721 AAACTGGTGA AGGCACATCG TAAGATT GAT GAAATTTTGG AAAAT GT GGT AAAT GAGCAC 
781 AAAC AG AAT C GAGCAGATGG TAAAAAGGGT AATGGTGAAT TTGGTGGAGA AGATCTGATT 
841 GATGTTTTGT TAAGAGTTAG AGAAAGT GGA GAAGTTCAAA TTCCAATCAC AGAT GAC AAT 
901 AT C AAAT C AA T ATT AAT C GA CATGTTCTCT GCCGGATCGG AAAC AT CATC GACAACTATA 
961 ATTT GGGCAT TAGCT GAAAT GAT GAAG AAA CCAAGTGTTT TAGCAAAGGC ACAAGCT GAA 
1021 GTGAGGCAAG CTTTGAAGGG GAAGAAAATT AGTTTTCAAG AGATTGATAT TGATAAGCTA 
1081 AAGTATTTGA AGTTAGTGAT CAAAGAAACT TTAAGAATGC ACCCTCCAAT TCCTCTGTTA 
1141 GTCCCTAGAG AAT GT AT GGA AGATACAAAG ATT GAT GGT T ACAATATACC TTT CAAAACA 
1201 AGAGTCATTG TTAATGCATG GGCAATTGGA CGAGATCCTC AAAGTTGGGA TGATCCTGAA 
1261 AGCTTTACGC CAGAGAGATT TGAGAATAAT TCTATTGATT TTCTTGGAAA TCATCATCAA
1321 TTTATTCCAT TTGGTGCAGG AAGAAGGATT TGTCCTGGAA TGCTATTTGG TTTAGCTAAT 
1381 GTTGGACAAC CTTTAGCTCA GTTACTTTAT CACTTCGATT GGAAACTCCC TAAT GGACAA 
1441 AGT CAT GAGA ATTTCGACAT GACTGAGTCA CCTGGAATTT CTGCTACAAG AAAGGAT GAT 
1501 CTTGTTTTGA TTGCCACTCC T TAT GATT CT TATTAAGCAG TAG C AGAAAT AAAAAGCCGG 
1561 GGCAAACAGA AAAAAGTATT GCTGCTTCTA GGTATTTT CT ATT GGATAAA TTTCAAAATT 
1621 CAT C CAC AAT ATTTAGTGTT TGCTAGAGTT GGTTAGC 

SEQ. ID. NO. 196 

1 MEIQFSNLVA FLLFLSSIFL VFKKWKTRKL NLPPGPWKLP FIGSLHHLAV AGPLPHHGLK
61 NLAKRYGPLM HLQLGQIPTL VISSPQMAKE VLKTHDLAFA TRPKLVVADI IHYDSTDIAF
121 SPYGEYWRQI RKICILELLS AKMVKFFSSI RQDELSKMVS SIRTTPNLPV NLTDKIFWFT
181 SSVICRSALG KICGDQDKLI IFMREIISLA GGFSIADFFP TWKMIHDIDG SKSKLVKAHR
241 KIDEILENVV NEHKQNRADG KKGNGEFGGE DLIDVLLRVR ESGEVQIPIT DDNIKSILID
301 MFSAGSETSS TTIIWALAEM MKKPSVLAKA QAEVRQALKG KKISFQEIDI DKLKYLKLVI
361 KETLRMHPPI PLLVPRECME DTKIDGYNIP FKTRVIVNAW AIGRDPQSWD DPESFTPERF
421 ENNSIDFLGN HHQFIPFGAG RRICPGMLFG LANVGQPLAQ LLYHFDWKLP NGQSHENFDM
481 TESPGISATR KDDLVLIATP YDSY 
NAME D163-AG11
ORGANISM NICOTIANA TABACUM

SEQ. ID. NO. 197

1 CTTCTTCCTT CCTAACTAAA AAT GGAGATT CAGTTTTCTA ACTTAGTTGC ATTCTTGCTC 

61 TTTCTCTCCA GCATCTTTCT TGTATTCAAA AAATGGAAAA CCAGAAAACT AAATTTGCCT 

121 CCTGGTCCAT GGAAATTACC TTTTATTGGA AGT TTACACC ATTTGGCTGT GGCAGGTCCA 

181 CTTCCTCACC ATGGCCTAAA AAATTTAGCC AAACGCTATG GTCCTCTTAT GCATTTACAA 

241 CTTGGACAAA TTCCTACACT C GT CAT AT CA TCACCTCAAA TGGCAAAAGA AGTACTAAAA 

301 ACTCACGACC TCGCTTTTGC CACTAGACCA AAGCTTGTCG TGGCCGACAT CATT CACTAC 

361 GACAGCACGG ACATAGCACT TTCGCCATAC GGTGAATACT GGAGACAAAT TCGTAAAATT 

421 TGCATATTGG AACTCTTGAG TGCCAAGATG GTCAAGTTTT TTAGCTCGAT TCGCCAAGAT 

481 GAGCTCTCGA AGATGGTTTC AT CTATACGA ACGACGCCCA ATCTTCCAGT CAAT CTTACC 

541 GACAAGATTT TTTGGTTTAC GAGTTCGGTA ATTTGTAGAT CAGCTTTAGG GAAGAT AT GT 

601 GGT GACCAAG ACAAATTGAT CATTTTTATG AGGGAAATAA TATCATTGGC AGGTGGATTT 

661 AGTATTGCTG ATTTTTTCCC TACATGGAAA AT GATT CAT G ATATTGATGG TTCAAAATCT 

721 AAACTGGTGA AGGCACATCG TAAGATT GAT GAAATTTTGG AAAATGTGGT AAAT GAG C AC 

781 AAACAGAATC GAGCAGATGG TAAAAAGGGT AATGGTGAAT TTGGTGGAGA AGATCT GATT 

841 GATGTTTTGT TAAGAGTTAG AGAAAGT G GA GAAGTT CAAA TTCCAATCAC AGAT GACAAT 

901 AT C AAAT C AA TATTAAT CGA CATGTTCTCT GCCGGATCGG AAA CAT CATC GACAACTATA 

961 ATTTGGGCAT TAGCTGAAAT GAT GAAGAAA CCAAGTGTTT TAGCAAAGGC ACAAGCTGAA 

1021 GT GAG CCAAG CTTTGAAGGG GAAGAAAATT AGTTTTCAAG AGAT T GAT AT TGATAAGCTA 

1081 AAGTATTTGA AGTTAGT GAT CAAAGAAACT TTAAGAATGC ACCCTCCAAT TCCTCTGTTA 

1141 GTCCCTAGAG AATGTATGGA AGATACAAAG ATTGATGGTT AC AAT AT AC C TTTCAAAACA 

1201 AGAGT CATT G TTAATGCATG GGCAATTGGA CGAGATCCTC AAAGTT GGGA TGATCCTGAA 

1261 AGCTTTACGC CAGAGAGATT TGAGAATAAT TCTATTGATT TTCTTGGAAA TCATCATCAA

1321 TTTATTCCAT TTGGTGCAGG AAGAAGGATT TGTCCTGGAA TGCTATTTGG TTTAGCTAAT 

1381 GT T GGACAAC CTTTAGCTCA GTTACTTTAT CACTTCGATT GGAAACTCCC T AAT G G AC AA 

1441 ACT CAC CAAA ATTTCGACAT GACTGAGTCA CCTGGAATTT CTGCTACAAG AAAGGAT GAT 

1501 CTTATTTTGA TTGCCACTCC TGCTCATTCT TGATTAAGTA TTGCTGCTTT TCTATTGGAG 

1561 AATTTTCAAA ATTCATCCAC AATATATAGT GTTTGCTAGA GTTGGTTAGC 

SEQ. ID. NO. 198 

1 MEIQFSNLVA FLLFLSSIFL VFKKWKTRKL NLPPGPWKLP FIGSLHHLAV AGPLPHHGLK 
61 NLAKRYGPLM HLQLGQIPTL VISSPQMAKE VLKTHDLAFA TRPKLVVADI IHYDSTDIAL
121 SPYGEYWRQI RKICILELLS AKMVKFFSSI RQDELSKMVS SIRTTPNLPV NLTDKIFWFT
181 SSVICRSALG KICGDQDKLI IFMREIISLA GGFSIADFFP TWKMIHDIDG SKSKLVKAHR
241 KIDEILENVV NEHKQNRADG KKGNGEFGGE DLIDVLLRVR ESGEVQIPIT DDNIKSILID
301 MFSAGSETSS TTIIWALAEM MKKPSVLAKA QAEVSQALKG KKISFQEIDI DKLKYLKLVI
361 KETLRMHPPI PLLVPRECME DTKIDGYNIP FKTRVIVNAW AIGRDPQSWD DPESFTPERF
421 ENNSIDFLGN HHQFIPFGAG RRICPGMLFG LANVGQPLAQ LLYHFDWKLP NGQTHQNFDM
481 TESPGISATR KDDLILIATP AHS
NAME D163-AG12
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 199 

1 ATCCTTCTTC CTTCCTAGGT CCTAACTAAA AATGGAGATT CAGTTTTCTA ACTTAGTTGC 
61 ATTCTTGCTC TTTCTCTCCA GCATCTTTCT TCTATTCAAA AAATGGAAAA CCAGAAAACT 
121 AAATTTGCCT CCTGGTCCAT GGAAATTACC TTTTATTGGA AGTTTACACC ATTTGGCTGT 
181 GGCAGGTCCA CTTCCTCACC ATGGCCTAAA AAATTTAGCC AAAC GCTAT G GTCCTCTTAT 
241 GCATTTACAA CTTGGACAAA TTCCTACACT CAT CAT AT C A T C AC CT C AAA TGGCAAAAGA 
301 AGTACTAAAA ACTCACGACC TCGCTTTTGC CACTAGACCA AAGCTTGTCG TGGCCGACAT 
361 CAT T CACT AC GACAGCACGG ACATAGCATT TTCTCCGTAC GGT GAATACT GGAGACAAAT 
421 TCGTAAAATT TGCATATTGG AACTCTTGAG TGCCAAGATG GTCAAATTTT TTAGCTCGAT 
481 TCGCCAAGAT GAGCTCTCGA AGATGCTCTC ATCTATACGA ACGACACCCA AT CTTACAGT 
541 CAAT CTTACT GACAAAATTT TTTGGTTTAC GAGTTCGGTA ACTT GTAGAT CAGCTTTAGG 
601 GAAGATATGT GGT GACCAAG ACAAATTGAT CATTTTTATG AGGGAAATAA TATCATTGGC 
661 AGGTGGATTT AGTATTGCTG ATTTTTTCCC TACAT GGAAA ATGATT CAT G ATATTGATGG 
721 TTCGAAATCT AAACTGGTGA AAGCACATCG TAAGATT GAT GAAATTTTGG GAAATGTTGT 
781 T GAT GAGCAC AAAAAGAACA GAGCAGATGG CAAGAAGGGT AATGGTGAAT TTGGTGGTGA 
841 AGATTTGATT GATGTATTGT TAAGAGTTAG AGAAAGT GGA GAAGT T C AAA TTCCTATCAC 
901 AAATGACAAT ATCAAATCAA TATTAATCGA CATGTTCTCT GCGGGATCTG AAAC AT CAT C 
961 GACGACTATA ATTTGGGCAT TAGCTGAAAT GATGAAGAAA CCAAGTGTTT TAGCAAAGGC 
1021 ACAAGCT GAA GTAAGGCAAG CTTTGAAGGA GAAAAAAGGT TTT CAACAGA TTGATCTTGA 
1081 TGAGCTAAAA TATCTCAAGT TAGTAATCAA AGAAACCTTA AGAAT GCACC CTCCAATTCC 
1141 TCTATTAGTT CCTAGAGAAT GTATGGAGGA TACAAAGATT GAT GGT TACA ATAT AC CTTT 
1201 CAAAACAAGA GTCATAGTTA ATGCATGGGC AAT CGGACGA GAT C C AG AAA GTTGGGATGA 
1261 CCCCGAAAGC TTTATGCCAG AGAGATTTGA GAATAGTT CT ATTGACTTTC TTGGAAATCA 
1321 TCATCAGTTT AT AC CAT TT G GTGCAGGAAG AAGGATTTGT CCGGGAATGC TATTTGGTTT 
1381 AGCTAATGTT GGACAACCTT TAGCT CAGTT ACTTTATCAC TTCGATTGGA AACTCCCTAA 
1441 TGGACAAAGT CAT G AGAAT T TCGACATGAC TGAGTCACCT GGAATTTCTG CTACAAGAAA 
1501 GGATGATCTT GTTTTGATTG CCACTCCTTA TGATTCTTAT TAAGCAGTAG CAGAAATAAA 
1561 AAGCCGGGGC AAACAGAAAA AAGTATTGCT GCTTCTAGGT ATTTTCTATT GGATAAATTT 
1621 CAAAATTCAT C CACAATATT TAGTGTTTGC TAGAGTT GGT TAGC 

SEQ. ID. NO. 200 

1 MEIQFSNLVA FLLFLSSIFL LFKKWKTRKL NLPPGPWKLP FIGSLHHLAV AGPLPHHGLK
61 NLAKRYGPLM HLQLGQIPTL IISSPQMAKE VLKTHDLAFA TRPKLVVADI IHYDSTDIAF
121 SPYGEYWRQI RKICILELLS AKMVKFFSSI RQDELSKMLS SIRTTPNLTV NLTDKIFWFT
181 SSVTCRSALG KICGDQDKLI IFMREIISLA GGFSIADFFP TWKMIHDIDG SKSKLVKAHR
241 KIDEILGNVV DEHKKNRADG KKGNGEFGGE DLIDVLLRVR ESGEVQIPIT NDNIKSILID
301 MFSAGSETSS TTIIWALAEM MKKPSVLAKA QAEVRQALKE KKGFQQIDLD ELKYLKLVIK 
361 ETLRMHPPIP LLVPRECMED TKIDGYNIPF KTRVIVNAWA IGRDPESWDD PESFMPERFE 
421 NSSIDFLGNH HQFIPFGAGR RICPGMLFGL ANVGQPLAQL LYHFDWKLPN GQSHENFDMT 
481 ESPGISATRK DDLVLIATPY DSY 
NAME D205-BG9

ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 201

1 TTCTTATTTT GATT CAACCA TGGAGAACCA ATACTCCTAC TCATTCTCTT CCTACTTCTA 
61 CTTAGCTATA GTACTGTTTC TTCTTCCAAT TTTGGTCAAA TATTTCTTCC ATCGGAGAAG 
121 AAATTTACCT C C AAGT C CAT TTTCTCTTCC AATAATTGGT CACCTTTACC TTCTCAAGAA 
181 AACTCTCCAT CTCACTCTAA CATCCTTATC AGCTAAATAT GGTCCTGTTT TATACCT CAA 
241 ATTGGGCTCT ATGCCTGTGA TTGTTGTGTC CTCACCATCT GCTGTTGAAG AATGTTTAAC 
301 CAAGAATGAT AT CAT ATT C G CAAATAGGCC CAAGACCGTG GCTGGTGACA AGT TTACCTA 
361 CAATTATACT GTTTATGTTT GGGCACCCTA TGGCCAACTT TGGAGAATTC TTCGCCGATT 
421 AACTGTCGTT GAACTCTTCT CTT CACATAG CCTACAGAAA ACTTCTATCC TTAGAGATCA 
481 AGAAGTTGCA ATATTTATCC GTTCGTTATA CAAATTCTCA AAGGATAGTA GCAAAAAAGT
541 CGATTTGACC AACTGGTCTT TTACTTTGGT TTTCAATCTT AT GACCAAAA TTATTGCTGG 
601 GAGACATATT GTGAAGGAGG AAGATGCTGG CAAGGAAAAG GGCATTGAAA TTATT GAAAA 
661 ACTTAGAGGG ACTTTCTTAG TAACTACATC ATTCTTGAAT AT GT GT GATT TCTTGCCAGT 
721 ATT CAGGT GG GTT GGTTACA AAGGGCT GGA G AAGAAG AT G GCCTGAATTC ACAATAGAAG 
781 AAATGAATTC TTGAACAGCT TGCTTGATGA ATTTCGACAC AAGAAAAGTA GTGCTTCACA 
841 ATCTAACACA ACTGTTGGAA ACATGGAGAA GAAAACCACA CTGATTGAAA AGCTCTTGTC
901 T CTT CAAGAA TCAGAGCCTG AATTCTACAC T GAT GAT AT C AT C AAAAGT A TTATGCTGGT 
961 AGTTTTTGTT G CAGGAAC AG AGACCTCATC AACAACCATC CAAT GGGTAA TGAGGCTTCT 
1021 TGTAGCTCAC CCT GAGGCAT TGTATAAGCT ACGAGCT GAC ATT GACAGTA AAGTT GGGAA 
1081 TAAGCGCTTG CTGAATGAAT CAGACCTCAA CAAGCTTCCG TATTTGCATT GTGTTGTTAA 
1141 TGAGACAATG AGATTATACA CTCCGATACC ACTT TTATT G CCTCATTATT CAACTAAAGA 
1201 TTGTATTGTG GAAGGATATG AT GTACCAAA ACATACAATG TTGTTTGTCA ACGCTTGGGC 
1261 CATTCACAGG GATCCCAAGG TAT GGGAGGA GCCT GACAAG TTCAAGCCAG AGAGATTTGA 
1321 GGCAACAGAA GGGGAAACAG AAAGGTTCAA TTACAAGCTT GTACCATTTG GAAT GGGGAG 
1381 AAGAGCGTGC CCTGGAGCTG ATATGGGGTT GCGAGCAGTT TCTTTGGCAT TAGGT GCACT 
1441 TATTCAATGC TTTGACTGGC AAATTGAGGA AGCGGAAAGC TTGGAGGAAA GCTATAATTC
1501 TAGAATGACT AT GCAGAACA AGCCTTTGAA GGTTGTCTGC ACTCCACGCG AAGATCTTGG 
1561 CCAGCTTCTA TCCCAACTCT AAGGCAATTT AT CAAT GCCA AACGTAATCT T CAT CT AC CA 
1621 CTATG 



SEQ. ID. NO. 202

1 MENQYSYSFS SYFYLAIVLF LLPILVKYFF HRRRNLPPSP FSLPIIGHLY LLKKTLHLTL 
61 TSLSAKYGPV LYLKLGSMPV IVVSSPSAVE ECLTKNDIIF ANRPKTVAGD KFTYNYTVYV
121 WAPYGQLWRI LRRLTVVELF SSHSLQKTSI LRDQEVAIFI RSLYKFSKDS SKKVDLTNWS
181 FTLVFNLMTK IIAGRHIVKE EDAGKEKGIE IIEKLRGTFL VTTSFLNMCD FLPVFRWVGY
241 KGLEKKMASI HNRRNEFLNS LLDEFRHKKS SASQSNTTVG NMEKKTTLIE KLLSLQESEP
301 EFYTDDIIKS IMLVVFVAGT ETSSTTIQWV MRLLVAHPEA LYKLRADIDS KVGNKRLLNE
361 SDLNKLPYLH CVVNETMRLY TPIPLLLPHY STKDCIVEGY DVPKHTMLFV NAWAIHRDPK
421 VWEEPDKFKP ERFEATEGET ERFNYKLVPF GMGRRACPGA DMGLRAVSLA LGALIQCFDW
481 QIEEAESLEE SYNSRMTMQN KPLKVVCTPR EDLGQLLSQL
NAME D207-AA5
ORGANISM NICOTIANA TABACUM

SEQ. ID. NO. 203 

1 AACCAACCTT CCTTTTCTTA CTTAGTAAAA TGGATATTCA GTCTTCTCCT TTCAACTTAA 
61 TTGCTTTGCT ACTCTTCATT TCATTTCTTT TTATCCTATT GAAAAAGTGG AATACCAAAA 
121 TCCCAAAGTT ACCTCCAGGT CCAT GGAGAC TTCCCCTTAT TGGCAGCCTC CAT CACTT GA 
181 AAGGTAAACT CCCACACCAT CATCTTAGAG ATTTAGCCCG AAAATAT GGA CCTCTCATGT 
241 ATTTACAACT T GGAGAAGTT CCTGTAGTTG TAATATCTTC GCCACGTATA GCAAAAGCTG 
301 TACTAAAAAC TCATGATCTT GCTTTTGCAA CGAGGCCTCG GTTCATGTCC TCGGACATTG
361 TGTTTTACAA AAGCAGGGAC ATATCATTCG CCCCATATGG CGATTACTGG AGACAAATGC
421 GTAAAATATT AACACAAGAA CTCTTGAGTA ACAAGATGCT CAAGTCATTT AGCACAATCC
481 GAAAGGATGA GCTCTCGAAG CTCCTCTCGT CGATTCGTTT AGCAACAGCT TCTTCTGCAG
541 T GAAC AT AAA CGAAAAGCTT CTCTGGTTTA CAAGTT GCAT GACTTGTAGA TTAGCCTTTG 
601 GAAAAATATG CAACGATCGT GAT GAATT GA TTAT GTTAAT AAGGGAGATA TTAGCATTAT 
661 CAGGAGGATT TGATGTGTGT GATTTGTTCC CTTCATGGAA ATTACTTCAC AATAT GAGCA 
721 ACAT GAAAGC TAGATTGACG AATGTTCACC ATAAGTATAA TCTAATTATG GAGAAT AT C A 
781 TCAAT GAGCA CAAAGAGAAT CAT GCAGCAG GGATAAAGGG AAATAACGAG TTTGGTGGCG 
841 AAGAT AT GAT T GAT GCTTTA CTGAGGGTTA AGGAGAATAA TGAGCTTCAA TTTCCTATCG 
901 AAAAT G AC AA CAT GAAAGCA GTAATTCTGG ACTTGTTTAT TGCTGGAACT GAAACTT CAT 
961 ATACT GCAAT TATATGGGCA CTAT CAGAAT T GAT GAAGCA CCCAAGTGTT ATGGCCAAGG 

1021 CACAAGCTGA AGTGAGAAAA GTCTTCAAAG AAAAT GAAAA CTTGGACGAA AATGATCTTG 
1081 ACAAGTTGCC ATACTTAAAA TCAGTGATCA AAGAAACACT AAGGAT GCAT CCTCCAGTTC 
1141 CTTTATTAGG AC CT AGAGAA TGCAGAGAAC AAACT GAGAT T GAT GGATAT ACTGTACCTC 
1201 TTAAAGCTAG AGTAATGGTT AATGCATGGG CAATT GGAAG AGATCCTGAA AGTT GGGAAG 
1261 ATCCTGAAAG TTTCAAACCC GAGCGATTTG AAAATATTT C TGTTGATCTT ACGGGAAATC 
1321 ACT AT CAGTT CATCCCTTTC GGTTCAGGAA GAAGAATGTG TCCAGGAATG TCGTTTGGTT 

1381 TAGTTAACAC TGGGCATCCT TTAGCTCAGT TGCTCTATTT CTTTGACTGG AAATTCCCTC
1441 ATAAGGTTAA TGCAGCTGAT TTTCACACTA CTGAAACAAG TAGAGTTTTT GCAGCAAGCA
1501 AAGATGACCT CTACTTGATT CCAACAAATC ACATGGAGCA AGAGTAGCTC TAAATTGAAT
1561 TCTTGTCTTG GAACAATAAA AGAAGAAACT CCAGCTTGGT CTACATTATT TCTTTTTGCT
1621 TTATATTAGT ATGGGTGTGT TCAGTTTCTT ATTTTTAAGG GTACCCT GAA AGATAAAGGG 
1681 CTATATAAAC CAGT GAGACT TTTTATTGGT TGCAAGGTTT TAGATCAAGC CATAAGACAG 
1741 CATATTTTAT TCAAAAAAAA AAAAAAA 

SEQ. ID. NO. 204

1 MDIQSSPFNL IALLLFISFL FILLKKWNTK IPKLPPGPWR LPLIGSLHHL KGKLPHHHLR
61 DLARKYGPLM YLQLGEVPVV VISSPRIAKA VLKTHDLAFA TRPRFMSSDI VFYKSRDISF
121 APYGDYWRQM RKILTQELLS NKMLKSFSTI RKDELSKLLS SIRLATASSA VNINEKLLWF
181 TSCMTCRLAF GKICNDRDEL IMLIREILAL SGGFDVCDLF PSWKLLHNMS NMKARLTNVH
241 HKYNLIMENI INEHKENHAA GIKGNNEFGG EDMIDALLRV KENNELQFPI ENDNMKAVIL
301 DLFIAGTETS YTAIIWALSE LMKHPSVMAK AQAEVRKVFK ENENLDENDL DKLPYLKSVI
361 KETLRMHPPV PLLGPRECRE QTEIDGYTVP LKARVMVNAW AIGRDPESWE DPESFKPERF
421 ENISVDLTGN HYQFIPFGSG RRMCPGMSFG LVNTGHPLAQ LLYFFDWKFP HKVNAADFHT
481 TETSRVFAAS KDDLYLIPTN HMEQE
NAME D207-AB4
ORGANISM NICOTIANA TABACUM

SEQ. ID. NO. 205 

1 AACCAACCTT CCTTTTCTTA CTTAGTAAAA TGGATATTCA GTCTTCTCCT TTCAACTTAA 
61 TTGCTTTGCT ACTCTTCATT TCATTTCTTT TTATCCTATT GAAAAAGTGG AATACCAAAA 
121 TCCCAAAGTT ACCTCCAGGT CCAT GGAGAC TTCCCCTTAT TGGCAGCCTC CAT CACTT GA 
181 AAGGTAAACT CCCACACCAT CAT CTTAGAG ATTTAGCCCG AAAATAT GGA CCTCTCATGT 
241 ATTTACAACT TGGAGAAGTT CCTGTAGTTG TAATATCTTC GCCACGTATA GCAAAAGCTG 
301 TACTAAAAAC TCATGATCTT GCTTTTGCAA CGAGGCCTCG GTTCATGTCC TCGGACATTG 
361 TGTTTTACAA AAGCAGGGAC AT AT CAT T C G CCCCATATGG CGATTACT GG AGACAAATGC 
421 GTAAAATATT AACACAAGAA CTCTTGAGTA ACAAGAT GCT CAAGTCATTT AGCACAATCC 
481 GAAAGGAT GG GCTCTCGAAG CTCCTCTCGT CGATTCGTTT AGCAACAGCT TCTTCTGCAG 
541 T GAACAT AAA CGAAAAGCTT CTCTGGTTTA CAAGTT GCAT GACTTGTAGA TTAGCCTTTG 
601 G AAAAAT AT G CAACGAT CGT GAT GAATT GA TTAT GTTAAT AAGGGAGATA TTAGCATTAT 
661 CAGGAGGATT TGATGTGTGT GATTTGTTCC CTT CAT GGAA ATTACTTCAC AATAT GAGCA 
721 ACAT GAAAGC TAGATTGACG AATGTTCACC ATAAGTATAA TCTAATTATG GAG AATAT CA 
781 T CAAT GAGCA CAAAGAGAAT CAT GCAGCAG GGATAAAGGG AAATAACGAG TTTGGTGGCG 
841 AAGAT AT GAT T GAT GCTTTA CTGAGGGTTA AGGAGAATAA TGAGCTTCAA TTTCCTATCG 
901 AAAAT G AC AA CAT GAAAGC A GTAATTCTGG ACTTGTTTAT TGCTGGAACT GAAACTTCAT 
961 ATACT GCAAT TATATGGGCA CT AT CAGAAT T GAT GAAGCA CCCAAGT GTT ATGGCCAAGG 
1021 CACAAGCT GA AGTGAGAAAA GTCTTCAAAG AAAAT GAAAA CTTGGACGAA AATGATCTTG 
1081 ACAAGTTGCC ATACTTAAAA TCAGTGATCA AAGAAACACT AAGGAT GCAT CCTCCAGTTC 
1141 CTTTATTAGG AC CT AGAGAA TGCAGAGAAC AAACT GAGAT T GAT GGATAT ACT GTACCT C 
1201 TTAAAGCTAG AGTAAT GGTT AATGCATGGG CAATT GGAAG AGATCCTGAA AGTT GGGAAG 
1261 AT C CT GAAAG TTTCAAACCC GAGCGATTTG AAAATATTTC T GTT GAT CTT ACGGGAAATC 
1321 ACTATCAGTT CATTCCTTTC GGTTCAGGAA GAAGAATGTG TCCAGGAATG TCGTTTGGTT 
1381 TAGTTAACAC TGGGCATCCT TTAGCT CAGT TGCTCTATTT CTTTGACTGG AAATTCCCTC 
1441 AT AAGGT T AA T GCAGCT GAT TTT CACACTA CT GAAACAAG TAGAGTTTTT GCAGCAAGCA 
1501 AAGAT GACCT CTACTTGATT CCAACAAATC ACAT G GAGCA AGAGTAGCTC TAAATTGAAT 
1561 TCTTGTCTTG GAACGATAAA AGAAGAAACT CCAGCTTGGT CTACATTATT TCTTTTTGCT 
1621 T TAT ATTAGT ATGGGTGTGT TCAGTTTCTT GTTTTTAAGG GTACCCT GAA AGATAAAGGG 
1681 CTATATAAAC CAGT GAGACT TTTTATTGAA AAAAAAAAAA AAAAAAAAAA AAAAAA 

SEQ. ID. NO. 206

1 MDIQSSPFNL IALLLFISFL FILLKKWNTK IPKLPPGPWR LPLIGSLHHL KGKLPHHHLR 
61 DLARKYGPLM YLQLGEVPVV VISSPRIAKA VLKTHDLAFA TRPRFMSSDI VFYKSRDISF
121 APYGDYWRQM RKILTQELLS NKMLKSFSTI RKDELSKLLS SIRLATASSA VNINEKLLWF
181 TSCMTCRLAF GKICNDRDEL IMLIREILAL SGGFDVCDLF PSWKLLHNMS NMKARLTNVH
241 HKYNLIMENI INEHKENHAA GIKGNNEFGG EDMIDALLRV KENNELQFPI ENDNMKAVIL 
301 DLFIAGTETS YTAIIWALSE LMKHPSVMAK AQAEVRKVFK ENENLDENDL DKLPYLKSVI 
361 KETLRMHPPV PLLGPRECRE QTEIDGYTVP LKARVMVNAW AIGRDPESWE DPESFKPERF 
421 ENISVDLTGN HYQFIPFGSG RRMCPGMSFG LVNTGHPLAQ LLYLFDWKFP HKVNAADFHT 
481 TETSRVFAAS KDDLYLIPTN HMEQE 
NAME D207-AC4

ORGANISM NICOTIANA TABACUM

SEQ. ID. NO. 207 ACClrT CCTTTTCTTA CTTAGTAAAA TGGATATTCA GTCTTCTCCT TTCAACTTAA
61 TTGCTTTGCT ACTCTTCATT TCATTTCTTT TTATCCTATT GAAAAAGTGG AATACCAAAA 
121 TCCCAAAGTT ACCTCCAGGT CCATGGAGAC TTCCCCTTAT TGGCAGCCTC CATCACTTGA 
181 AAGGTAAACT CCCACACCAT CATCTTAGAG ATTTAGCCCG AAAATATGGA CCTCTCATGT 
241 ATTTACAACT TGGAGAAGTT CCTGTAGTTG TAATATCTTC GCCACGTATA GCAAAAGCTG 
301 TACTAAAAAC T CAT GAT CTT GCTTTTGCAA CGAGGCCTCG GTTCATGTCC TCGGACATTG 
361 TGTTTTACAA AAGCAGGGAC ATATCATTCG CCCCATATGG CGATTACTGG AGACAAATGC 
421 GTAAAATATT AACACAAGAA CTCTTGAGTA ACAAGATGCT CAAGTCATTT AGCACAATCC 
481 GAAAGGATGA GCTCTCGAAG CTCCTCTCGT CGATTCGTTT AGCAACAGCT TCTTCTGCAG 
541 T GAACAT AAA CGAAAAGCTT CTCTGGTTTA CAAGTTGCAT GACTT GTAGA TTAGCCTTTG 
601 GAAAAATATG CAACGATCGT GATGAATTGA TTAT GTTAAT AAGGGAGATA TTAGCATTAT 
661 CAGGAGGATT TGATGTGTGT GATTTGTTCC CTT CATGGAA ATTACTTCAC AATATGAGCA 
721 ACAT GAAAGC TAGATTGACG AATGTTCACC ATAAGTATAA TCTAATTATG GAGAATATCA 
781 TCAATGAGCA CAAAGAGAAT CATGCAGCAG GGATAAAGGG AAATAACGAG TTTGGTGGCG 
841 AAGATAT GAT TGATGCTTTA CTGAGGGTTA AGGAGAATAA TGAGCTTCAA TTTCCTATCG 
901 AAAATGACAA CAT GAAAGCA GTAATTCTGG ACTTGTTTAT TGCTGGAACT GAAACTTCAT 
961 ATACTGCAAT TATATGGGCA CTAT CAGAAT TGATGAAGCA CCCAAGTGTT AT GGCCAAGG 
1021 CACAAGCTGA AGT GAGAAAA GTCTTCAAAG AAAATGAAAA CTTGGACGAA AATGATCTTG 
1081 ACAAGTTGCC ATACTTAAAA TCAGTGATCA AAGAAACACT AAGGATGCAT CCTCCAGTTC 
1141 CTTTATTAGG ACCTAGAGAA TGCAGAGAAC AAACT GAGAT T GAT GGATAT ACTGTACCTC 
1201 TTAAAGCTAG AGTAATGGTT AATGCATGGG CAATTGGAAG AGATCCTGAA AGTTGGGAAG 
1261 ATCCTGAAAG TTTCAAACCC GAGCGATTTG AAAATATTTC TGTTGATCTT ACGGGAAATC 
1321 ACT AT CAGTT CATTCCTTTC GGTTCAGGAA GAAGAATGTG TCCAGGAATG TCGTTTGGTT 
1381 TAGTTAACAC TGGGCATCCT TTAGCTCAGT TGCTCTATCT CTTTGACTGG AAATTCCCTC 
1441 ATAAGGTTAA TGCAGCTGAT TTTCACACTA CT GAAACAAG TAGAGTTTTT GCAGCAAGCA 
1501 AAGATGACCT CTACTT GATT CCAACAAATC ACAT GGAGCA AGAGTAGCTC TAAATTGAAT 
1561 TCTTGTCTTG GAACAATAAA AGAAGAAACT CCAGCTTGGT CTACATTATT TCCTTTTGCT 
1621 TTATATTAGT ATGGGTGTGT TCAGTCTCTT GTTTTTAAGG GTACCCTGAA AGATAAAGGG 
1681 CTATATAAAC CAGT GAGACT TTTTATTGGT TGCAAGGTTT TAGATCAAGC CATAAGACAG 
1741 CATATTTTAT TCCACCATTT TCTATCATGT TTAATAAAGT TCCTTTCGTT T ATT GTTAGA 
1801 AAAAAAAAAA AAAAAAAAAA AAA 

SEQ. ID. NO. 208 spFNL IALLLFISFL FILLKKWNTK IPKLPPGPWR LPLIGSLHHL KGKLPHHHLR
61 DLARKYGPLM YLQLGEVPVV VISSPRIAKA VLKTHDLAFA TRPRFMSSDI VFYKSRDISF
121 APYGDYWRQM RKILTQELLS NKMLKSFSTI RKDELSKLLS SIRLATASSA VNINEKLLWF
181 TSCMTCRLAF GKICNDRDEL IMLIREILAL SGGFDVCDLF PSWKLLHNMS NMKARLTNVH
241 HKYNLIMENI INEHKENHAA GIKGNNEFGG EDMIDALLRV KENNELQFPI ENDNMKAVIL
301 DLFIAGTETS YTAIIWALSE LMKHPSVMAK AQAEVRKVFK ENENLDENDL DKLPYLKSVI
361 KETLRMHPPV PLLGPRECRE QTEIDGYTVP LKARVMVNAW AIGRDPESWE DPESFKPERF
421 ENISVDLTGN HYQFIPFGSG RRMCPGMSFG LVNTGHPLAQ LLYLFDWKFP HKVNAADFHT
481 TETSRVFAAS KDDLYLIPTN HMEQE 
NAME D209-AA10
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 209 

1 ATATGCAACT GAGATTTGAA GAATACCAAC TAACCAAAAT GCAGTTCTTC AGCCTGGTTT 
61 CCATTTTCCT ATTTCTATCT TTCCTCTTTT TGTTAAGGGT AT GGAAGAAC TCCAATAGCC 
121 AAAGCAAAAA GTTGCCACCA GGTCCATGGA AACTACCAAT ACTAGGAAGT AT GCTT CATA 
181 TGGTTGGTGG ACTACCACAC CATGTCCTTA GAGATTTAGC CAAAAAATAT GGACCACTTA 
241 TGCACCTTCA ATTAGGT GAA GTTTCTGCGG TTGTGGTTAC TTCTCCTGAT ACGGCAAAAG 
301 AAGTATTAAA AACT CAT GAC ATCGCTTTTG CGTCTAGGCC TAGCCTTTTG GCCCCGGAGA 
361 TTGTCTGTTA CAATAGGTCT GATCTAGCCT TTTGCCCCTA TGGCGACTAT TGGAGACAAA 
421 TGCGTAAAAT ATGTGTCTTG GAAGTGCTCA GTGCCAAGAA TGTTCGGACA TTTAGCTCTA 
481 TTAGGCGGAA TGAAGTTCTT CGTCTCATTA ATTTTATCCG GTCATCTTCT GGT GAACCTA 
541 TTAAT GTTAC GGAAAGGATC TTTTTGTTCA CAAGCTCCAT GAC AT GT AGA TCAGCGTTTG 
601 GGCAAGT GT T CAAAGAGCAA GACAAATTTA TACAACTAAT TAAAGAAGTG ATACTCTTAG 
661 CAGGAGGGTT TGATGTGGCT GACATATTCC CT T C ACT GAA GT TT CTT CAT GTGCTCAGTG 
721 GAATGAAGGG TAAGATTATG AATGCACACC ATAAGGTAGA TGCCATTGTT GAGAATGTCA 
781 TCAATGAGCA CAAGAAAAAT CTTGCAATTG GGAAAACTAA TGGAGCGTTA GGAGGTGAAG 
841 ATTTAATTGA TGTTCTTCTA AGACTTATGA AT GAT GGAG G CCTTCAATTT CCTATCACCA 
901 ACGACAACAT CAAAGCTATA ATTTTTGACA TGTTTGCTGC CGGGACAGAG ACTTCATCGT 
961 CAACAATTGT GTGGGCTATG GTAGAAATGG TGAAAAATCC AGCCGTATTC GCGAAAGCTC
1021 AAGCAGAAGT AAGAGAAGCA TTTAGAGGAA AAGAAACTTT CGAT GAAAAT GATGTGGAGG 
1081 AGCTAAACTA CCTAAAGTTA GTAATAAAAG AAACT CTAAG ACTTCATCCA CCGGTTCCAC 
1141 TTTTGCTCCC AAGAGAATGT AGGGAAGAGA CAAATATAAA CGGCTACACT ATTCCTGTAA 
1201 AGACCAAAGT CAT GGT TAAT GTTTGGGCTT TGGGAAGAGA T C C AAAAT AT TGGAATGACG 
1261 CAGAAACTTT TAT GCCAGAG AGATTTGAGC AGTGCTCTAA GGATTTTGTT GGTAATAATT 
1321 TTGAATATCT TCCATTTGGT GGCGGAAGGA GGATTTGTCC TGGGATTTCG TTTGGCTTAG 
1381 CTAAT GCTT A TTTGCCATTG GCTCAATTAC TAT AT C ACT T CGATTGGAAA CTCCCTGCTG 
1441 GAATCGAACC AAGCGACTTG GACTTGACTG AGTTGGTTGG AGTAACTGCC GCTAGAAAAA
1501 GTGACCTTTA CTTGGTTGCG ACTCCTTATC AACCTCCTCA AAAGTGATTT AATGGTTTCA 
1561 AGTTTTTATT T CCTAGCAAA CCCCACTATT GTCCTATCTT TCTTTTGGTG TTTTCGGTTT 
1621 TAT CT ACT CT AATACATGCA TCTTTTACCA TAT AG GAAT G T AC CAT GTT G TCG 

SEQ. ID. NO. 210 

1 MQLRFEEYQL TKMQFFSLVS IFLFLSFLFL LRVWKNSNSQ SKKLPPGPWK LPILGSMLHM
61 VGGLPHHVLR DLAKKYGPLM HLQLGEVSAV VVTSPDTAKE VLKTHDIAFA SRPSLLAPEI
121 VCYNRSDLAF CPYGDYWRQM RKICVLEVLS AKNVRTFSSI RRNEVLRLIN FIRSSSGEPI
181 NVTERIFLFT SSMTCRSAFG QVFKEQDKFI QLIKEVILLA GGFDVADIFP SLKFLHVLSG
241 MKGKIMNAHH KVDAIVENVI NEHKKNLAIG KTNGALGGED LIDVLLRLMN DGGLQFPITN 
301 DNIKAIIFDM FAAGTETSSS TIVWAMVEMV KNPAVFAKAQ AEVREAFRGK ETFDENDVEE 
361 LNYLKLVIKE TLRLHPPVPL LLPRECREET NINGYTIPVK TKVMVNVWAL GRDPKYWNDA 
421 ETFMPERFEQ CSKDFVGNNF EYLPFGGGRR ICPGISFGLA NAYLPLAQLL YHFDWKLPAG 
481 IEPSDLDLTE LVGVTAARKS DLYLVATPYQ PPQK 
NAME D209-AA12
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 211 

1 ATATGCAACT GAGATTTGAA GAATACCAAC TAACCAAAAT GCAGTTCTTC AGCTTGGTTT 
61 CCATTTTCCT ATTTCTATCT TTCCTCTTTT TGTTAAGGAT ATGGAAGAAC TCCAATAGCC 
121 AAAGCAAAAA GTTGCCACCA GGTCCATGGA AACTACCAAT ACTAGGAAGT ATGCTTCATA 
181 TGGTTGGTGG ACTACCACAC CATGTCCTTA GAGATTTAGC CAAAAAATAT GGACCACTTA 
241 TGCACCTTCA ATTAGGTGAA GTTTCTGCGG TTGTGGTTAC TTCTCCTGAT ACGGCAAAAG 
301 AAGTATTAAA AACTCATGAC ATCGCTTTTG CGTCTAGGCC TAGCCTTTTG GCCCCGGAGA 
361 TTGTCTGTTA CAATAGGTCT GATCTAGCCT TTTGCCCCTA TGGCGACTAT TGGAGACAAA 
421 TGCGTAAAAT ATGTGTCTTG GAAGTGCTCA GTGCCAAGAA TGTTCGGACA TTTAGCTCTA 
481 TTAGGCGGAA TGAAGTTCTT CGTCTCATTA ATTTTATCCG GTCATCTTCT GGTGAACCTA 
541 TTAATGTTAC GGAAAGGATC TTTTTGTTCA CAAGCTCCAT GACATGTAGA TCAGCGTTTG 
601 GGCAAGTGTT CAAAGAGCAA GACAAATTTA TACAACTAAT TAAAGAAGTG ATACTCTTAG 
661 CAGGAGGGTT TGATGTGGCT GACATATTCC CTTCACTGAA GTTTCTTCAT GTGCTCAGTG 
721 GAATGAAGGG TAAGATTATG AATGCACACC ATAAGGTAGA TGCCATTGTT GAGAATGTCA 
781 TCAATGAGCA CAAGAAAAAT CTTGCAATTG GGAAAACTAA TGGAGCGTTA GGAGGTGAAG 
841 ATTTAATTGA TGTTCTTCTA AGACTTATGA ATGATGGAGG CCTTCAATTT CCTATCACCA 
901 ACGACAACAT CAAAGCCATA ATTTTTGACA TGTTTGCTGC CGGGACAGAG ACTTCATCGT 
961 CAACAATTGT GTGGGCTATG GTAGAAATGG TGAAAAATCC AGCCGTATTC GCGAAAGCTC 
1021 AAGCAGAAGT AAGAGAAGCA TTTAGAGGAA AAGAAACTTT CGATGAAAAT GATGTGGAGG 
1081 AGCTAAACTA CCTAAAGTTA GTAATAAAAG AAACTCTAAG ACTTCATCCA CCGGTTCCAC 
1141 TTTTGCTCCC AAGAGAATGT AGGGAAGAGA CAAATATAAA CGGCTACACT ATTCCTGTAA 
1201 AGACCAAAGT CATGGTTAAT GTTTGGGCTT TGGGAAGAGA TCCAAAATAT TGGAATGACG 
1261 CAGAAACTTT TATGCCAGAG AGATTTGAGC AGTGCTCTAA GGATTTTGTT GGTAATAATT 
1321 TTGAATATCT TCCATTTGGT GGCGGAAGGA GGATTTGTCC TGGGATTTCG TTTGGCTTAG 
1381 CTAATGCTTA TTTGCCATTG GCTCAATTAC TATATCACTT CGATTGGAAA CTCCCTGCTG 
1441 GAATCGAACC AAGCGACTTG GACTTGACTG AGTTGGTTGG AGTAACTGCC GCTAGAAAAA 
1501 GTGACCTTTA CTTGGTTGCG ACTCCTTATC AACCTCCTCA AAAGTGATTT AATGGTTTCA 
1561 AGTTTTTATT TCCTAGCAAA CCCCACTATT GTCCTATCTT TCTTTTGGTG TTTTCGGTTT 
1621 TATCTACTCT AATACATGCA TCTTTTACCA TATAGGAATG TAGCATGTTG TCG 

SEQ. ID. NO. 212 

1 MQLRFEEYQL TKMQFFSLVS IFLFLSFLFL LRIWKNSNSQ SKKLPPGPWK LPILGSMLHM 
61 VGGLPHHVLR DLAKKYGPLM HLQLGEVSAV VVTSPDTAKE VLKTHDIAFA SRPSLLAPEI 
121 VCYNRSDLAF CPYGDYWRQM RKICVLEVLS AKNVRTFSSI RRNEVLRLIN FIRSSSGEPI 
181 NVTERIFLFT SSMTCRSAFG QVFKEQDKFI QLIKEVILLA GGFDVADIFP SLKFLHVLSG 
241 MKGKIMNAHH KVDAIVENVI NEHKKNLAIG KTNGALGGED LIDVLLRLMN DGGLQFPITN 
301 DNIKAIIFDM FAAGTETSSS TIVWAMVEMV KNPAVFAKAQ AEVREAFRGK ETFDENDVEE 
361 LNYLKLVIKE TLRLHPPVPL LLPRECREET NINGYTIPVK TKVMVNVWAL GRDPKYWNDA 
421 ETFMPERFEQ CSKDFVGNNF EYLPFGGGRR ICPGISFGLA NAYLPLAQLL YHFDWKLPAG 
481 IEPSDLDLTE LVGVTAARKS DLYLVATPYQ PPQK 

NAME D209-AH10 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 213 

1 ATATGCAACT GAGATTTGAA GAATACCAAC TAACCAAAGT GCAGTTCTTC AGCTTGGTTT 
61 CCATTTTCCT ATTTCTATCT TTCCTCTTTT TGTTAAGGAT ATGGAAGAAC TCCAATAGCC 
121 AAAGCAAAAA GTTGCCACCA GGTCCATGGA AACTACCAAT ACTAGGAAGT ATGCTTCATA 
181 TGGTTGGTGG ACTACCACAC CATGTCCTTA GAGATTTAGC CAAAAAATAT GGACCACTTA 
241 TGCACCTTCA ATTAGGTGAA GTTTCTGCGG TTGTGGTTAC TTCTCCTGAT ACGGCAAAAG 
301 AAGTATTAAA AACTCATGAC ATCGCTTTTG CGTCTAGGCC TAGCCTTTTG GCCCCGGAGA 
361 TTGTCTGTTA CAATAGGTCT GATCTAGCCT TTTGCCCCTA TGGCGACTAT TGGAGACAAA 
421 TGCGTAAAAT ATGTGTCTTG GAAGTGCTCA GTGCCAAGAA TGTTCGGACA TTTAGCTCTA 
481 TTAGGCGGAA TGAAGTTCTT CGTCTCATTA ATTTTATCCG GTCATCTTCT GGTGAACCTA 
541 TTAATGTTAC GGAAAGGATC TTTTTGTTCA CAAGCTCCAT GACATGTAGA TCAGCGTTTG 
601 GGCAAGTGTT CAAAGAGCAA GACAAATTTA TACAACTAAT TAAAGAAGTG ATACTCTTAG 
661 CAGGAGGGTT TGATGTGGCT GACATATTCC CTTCACTGAA GTTTCTTCAT GTGCTCAGTG 
721 GAATGAAGGG TAAGATTATG AATGCACACC ATAAGGTAGA TGCCATTGTT GAGAATGTCA 
781 TCAATGAGCA CAAGAAAAAT CTTGCAATTG GGAAAACTAA TGGAGCGTTA GGAGGTGAAG 
841 ATTTAATTGA TGTTCCTCTA AGACTTATGA ATGATGGAGG CCTTCAATTT CCTATCACCA 
901 ACGACAACAT CAAAGCTATA ATTTTTGACA TGTTTGCTGC CGGGACGGAG ACTTCATCGT 
961 CAACAATTGT GTGGGCTATG GTAGAAATGG TGAAAAATCC AGCCGTATTC GCGAAAGCTC 
1021 AAGCAGAAGT AAGAGAAGCA TTTAGAGGAA AAGAAACTTT CGATGAAAAT GATGTGGAGG 
1081 AGCTAAACTA CCTAAAGTTA GTAATAAAAG AAACTCTAAG ACTTCATCCA CCGGTTCCAC 
1141 TTTTGCTCCC AAGAGAATGT AGGGAAGAGA CAAATATAAA CGGCTACACT ATTCCTGTAA 
1201 AGACCAAAGT CATGGTTAAT GTTTGGGCTT TGGGAAGAGA TCCAAAATAT TGGAATGACG 
1261 CAGAAACTTT TATGCCAGAG AGATTTGAGC AGTGCTCTAA GGATTTTGTT GGTAATAATT 
1321 TTGAATATCT TCCATTTGGT GGCGGAAGGA GGATTTGTCC TGGGATTTCG TTTGGCTTAG 
1381 CTAATGCTTA TTTGCCATTG GCTCAATTAC TATATCACTT CGATTGGAAA CTCCCTGCTG 
1441 GAATCGAACC AAGCGACTTG GACTTGACTG AGTTGGTTGG AGTAACTGCC GCTAGAAAAA 
1501 GTGACCTTTA CTTGGTTGCG ACTCCTTATC AACCTCCTCA AAAGTGATTT AATGGTTTCA 
1561 AGTTTTTATT TCCTAGCAAA CCCCACTATT GTCCTATCTT TCTTTTGGTG TTTTCGGTTT 
1621 TATCTACTCT AATACATGCA TCTTTTACCA TATAGGAATG TACCATGTTG TCG 

SEQ. ID. NO. 214 

1 MQLRFEEYQL TKVQFFSLVS IFLFLSFLFL LRIWKNSNSQ SKKLPPGPWK LPILGSMLHM 
61 VGGLPHHVLR DLAKKYGPLM HLQLGEVSAV VVTSPDTAKE VLKTHDIAFA SRPSLLAPEI 
121 VCYNRSDLAF CPYGDYWRQM RKICVLEVLS AKNVRTFSSI RRNEVLRLIN FIRSSSGEPI 
181 NVTERIFLFT SSMTCRSAFG QVFKEQDKFI QLIKEVILLA GGFDVADIFP SLKFLHVLSG 
241 MKGKIMNAHH KVDAIVENVI NEHKKNLAIG KTNGALGGED LIDVPLRLMN DGGLQFPITN 
301 DNIKAIIFDM FAAGTETSSS TIVWAMVEMV KNPAVFAKAQ AEVREAFRGK ETFDENDVEE 
361 LNYLKLVIKE TLRLHPPVPL LLPRECREET NINGYTIPVK TKVMVNVWAL GRDPKYWNDA 
421 ETFMPERFEQ CSKDFVGNNF EYLPFGGGRR ICPGISFGLA NAYLPLAQLL YHFDWKLPAG 
481 IEPSDLDLTE LVGVTAARKS DLYLVATPYQ PPQK 

NAME D87A-AF3 
ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 215 

1 [illegible] ATGCTCACAA CAGCAAAATT GCAGCAATCT GTTTGATAAT TTTCTTGGTA 
61 TATAAAGCAT GGGAATTGTT GAAGTGGATA TGGATTAAGC CAAAGAAACT GGAGAGTTGC 
121 CTCAGAAAAC AGGGACTCAA AGGAAATTCC TACAGGCTAT TCTATGGAGA TATGAAAGAA 
181 TTGTCCAAAA GTCTCAAGGA AATCAATTCA AAGCCCATCA TCAATCTATC AAATGAAGTA 
241 GCCCCAAGAA TCATTCCTTA TTATCTTGAA ATCATCCAAA AATATGGTAA AAGATGTTTT 
301 GTTTGGCAAG GACCAACCCC CGCAATATTA ATAACAGAGC CAGAATTAAT AAAGGAGATA 
361 TTTGGTAAGA ACTATGTTTT TCAGAAGCCT AATAATCCCA ACCCACTGAC CAAGTTATTG 
421 GCTCGAGGTG TTGTAAGCTA CGAGGAAGAA AAATGGGCAA AACACAGAAA GATCTTAAAC 
481 CCTGCCTTTC ATATGGAGAA GTTGAAGCAT ATGCTACCAG CATTTTACTT GAGCTGTAGT 
541 GAGATGCTGA ACAAATGGGA GGAGATTATC CCAGTAAAAG AATCAAATGA GTTGGACATT 
601 TGGCCTCATC TTCAAAGAAT GACAAGTGAT GTGATTTCTC GTGCTGCCTT TGGTAGTAGC 
661 TACGAAGAAG GAAGAAGAAT ATTTGAACTT CAAGAAGAAC AAGCTGAGTA TCTAACGAAG 
721 ACATTCAATT CAGTTTATAT CCCAGGTTCC AGATTTTTTC CCAATAAAAT GAACAAAAGA 
781 ATGAAAGAAT GTGAAAAGGA AGTACGAGAA ACAATTACGT GTCTAATTGA CAACAGATTA 
841 AAGGCAAAAG AAGAAGGCAA TGGCAAGGCC CTCAATGATG ACCTACTGGG TATATTATTA 
901 GAGTCAAATT CTATAGAAAT TGAAGAACAT GGTAACAAGA AGTTTGGAAT GAGTATACCT 
961 GAAGTAATTG AAGAGTGCAA ATTATTCTAT TTTGCTGGCC AAGAGACTAC ATCAGTATTG 
1021 CTTGTGTGGA CACTGATTTT GTTAGGGAGA AATCCAGAAT GGCAGGAACG TGCTAGAGAG 
1081 GAAGTTTTTC AAGCCTTTGG AAGTGATAAA CCAACTTTTG ACGAATTATA TCGCTTGAAA 
1141 ATTGTGACGA TGATTTTGTA CGAGTCTTTA AGGTTATATC CACCAATAGC AACTCGTACT 
1201 CGAAGGACTA ATGAAGAAAC AAAATTAGGG GAACTAGATT TACCAAAGGG TGCACTGCTC 
1261 TTTATACCAA CAATCTTATT ACATCTTGAC AAGGAAATTT GGGGTGAAGA TGCAGATGAG 
1321 TTCAATCCGG AGAGATTTAG CGAAGGGGTG GCAAAGGCAA CAAAGGGGAA AATGACATAT 
1381 TTTCCATTTG GTGCAGGACC GCGAAAATGC ATTGGGCAAA ACTTCGCGAT TTTGGAAGCA 
1441 AAAATGGCTA TAGCTATGAT TCTACAACGC TTCTCCTTCG AGCTCTCTCC ATCTTATACA 
1501 CACTCTCCAT ACACTGTGGT CACTTTGAAA CCCAAATATG GTGCTCCCCT AATAATGCAC 
1561 AGGCTGTAGT CCTGTGAGAA 

SEQ. ID. NO. 216 

1 [illegible]SKIA AICLIIFLVY KAWELLKWIW IKPKKLESCL RKQGLKGNSY RLFYGDMKEL 
61 SKSLKEINSK PIINLSNEVA PRIIPYYLEI IQKYGKRCFV WQGPTPAILI TEPELIKEIF 
121 GKNYVFQKPN NPNPLTKLLA RGVVSYEEEK WAKHRKILNP AFHMEKLKHM LPAFYLSCSE 
181 MLNKWEEIIP VKESNELDIW PHLQRMTSDV ISRAAFGSSY EEGRRIFELQ EEQAEYLTKT 
241 FNSVYIPGSR FFPNKMNKRM KECEKEVRET ITCLIDNRLK AKEEGNGKAL NDDLLGILLE 
301 SNSIEIEEHG NKKFGMSIPE VIEECKLFYF AGQETTSVLL VWTLILLGRN PEWQERAREE 
361 VFQAFGSDKP TFDELYRLKI VTMILYESLR LYPPIATRTR RTNEETKLGE LDLPKGALLF 
421 IPTILLHLDK EIWGEDADEF NPERFSEGVA KATKGKMTYF PFGAGPRKCI GQNFAILEAK 
481 MAIAMILQRF SFELSPSYTH SPYTVVTLKP KYGAPLIMHR L 

NAME D208-AC8 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 217 

1 ATGCTTTCTC CCATAGAAGC CTTTGTAGGA CTAGTAACCT TCACATTTCT CTTATACTTC 
61 CTATGGACAA AAAAATCTCA AAAACTTCCA AAACCCTTAC CACCGAAAAT CCCCGGAGGA 
121 TGGCCGGTAA TCGGCCATCT TTTTCACTTC AATAACGACG GCGACGACCG TCCATTAGCT 
181 CGAAAGCTCG GAGACTTAGC TGATAAATAC GGCCCCGTTT TCACTTTTCG GCTAGGTCTT 
241 CCCCTTGTGC TAGTTGTAAG CAGTTACGAA GCTATAAAAG ATTGCTTCTC TACAAATGAT 
301 GCCATTTTCT CCAATCGTCC AGCTCTTCTT TACGGCGAAT ACCTTGGCTA CAATAATACA 
361 ATGCTTTTTC TAGCAAATTA CGGACCTTAC TGGCGAAAAA ATCGTAAATT AGTCATTCAG 
421 GAAGTTCTCT CTGCTAGTCG TCTCGAAAAA TTCAAACAAG TGAGATTCAC CAGAATTCAA 
481 ACGAGCATTA AGAATTTATA CACTCGAATT AATGGAAATT CGAGTACGAT AAATCTAACT 
541 GATTGGTTAG AAGAATTGAA TTTTGGTCTG ATCGTGAAAA TGATCGCTGG GAAAAATTAT 
601 GAATCCGGTA AAGGAGATGA ACAAGTGGAA AGATTTAAGA ATGCGTTTAA GGATTTTATG 
661 GTTTTATCAA TGGAATTTGT ATTATGGGAT GCATTTCCAA TTCCATTATT TAAATGGGTG 
721 GATTTTCAAG GTCATATTAA GGCAATGAAA AGGACATTTA AGGATATAGA TTCTGTTTTT 
781 CAGAACTGGT TAGAGGAACA TATTAATAAA AGAGAAAAAA TAGAGGTTGG TGCAGAAGGG 
841 AATGAACAAG ATTTCATTGA TGTGGTGCTT TCAAAATTGA GTAAAGAATA TCTTGATGAA 
901 GGTTACTCTC GTGATACTGT CATTAAAGCA ACAGTTTTTA GTTTGGTCTT GGATGCAGCA 
961 GACACAGTTG CTCTTCACAT AAATTGGGGA ATGACATTAT TGATAAACAA TCAAAATGCC 
1021 TTGATGAAAG CACAAGAAGA GATAGACACA AAAGTTGGTA AGGATAGATG GGTAGAAGAG 
1081 AGTGATATTA AGGATTTAGT ATACCTCCAA GCTATTGTTA AAAAGGTGTT ACGATTATAT 
1141 CCACCAGGAC CTTTGTTAGT ACCACATGAA AATGTAAAGG ATTGTGTTGT TAGTGGATAT 
1201 CACATTCCTA AAGGGACTAG ATTATTCGCA AACGTCATGA AACTGCAGCG CGATCCTAAA 
1261 CTCTTGTCAA ATCCTGATAA GTTCGATCCA GAGAGATTCA TCGCTGGTGA TATTGACTTC 
1321 CGTGGTCACC ACTATGAGTT TATCCCATTT GGTTCTGGAA GACGATCTTG TCCGGGGATG 
1381 ACTTATGCAT TGCAAGTGGA ACACCTAACA ATGGCACATT TAATCCAGGG TTTCAATTAC 
1441 AAAACTCCAA ATGACGAGGC CTTGGATATG AAGGAAGGTG CAGGCATAAC AATACGTAAG 
1501 GTAAATCCAG TGGAATTGAT AATAACGCCT CGCTTGGCAC CTGAGCTTTA CTAAAACCTA 
1561 AGATGTTTCA TCTTGGTTGA TCATTGT 

SEQ. ID. NO. 218 

1 MLSPIEAFVG LVTFTFLLYF LWTKKSQKLP KPLPPKIPGG WPVIGHLFHF NNDGDDRPLA 
61 RKLGDLADKY GPVFTFRLGL PLVLVVSSYE AIKDCFSTND AIFSNRPALL YGEYLGYNNT 
121 MLFLANYGPY WRKNRKLVIQ EVLSASRLEK FKQVRFTRIQ TSIKNLYTRI NGNSSTINLT 
181 DWLEELNFGL IVKMIAGKNY ESGKGDEQVE RFKNAFKDFM VLSMEFVLWD AFPIPLFKWV 
241 DFQGHIKAMK RTFKDIDSVF QNWLEEHINK REKIEVGAEG NEQDFIDVVL SKLSKEYLDE 
301 GYSRDTVIKA TVFSLVLDAA DTVALHINWG MTLLINNQNA LMKAQEEIDT KVGKDRWVEE 
361 SDIKDLVYLQ AIVKKVXRLY PPGPLLVPHE NVKDCVVSGY HIPKGTRLFA NVMKLQRDPK 
421 LLSNPDKFDP ERFIAGDIDF RGHHYEFIPF GSGRRSCPGM TYALQVEHLT MAHLIQGFNY 
481 KTPNDEALDM KEGAGITIRK VNPVELIITP RLAPELY 

NAME D215-AB5 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 219 

1 GGGAGAAGGC CTTCAATATG GAGATACCAT ATTACAGCTT AAAAATTGCA ATTTCTTCAT 
61 TTGCAATTAT CTTTGTACTA AGATGGGCAT GGAAAATCTT GAATTATGTG TGGTTAAAAC 
121 CAAAAGAATT GGAGAAATAC CTCAGACAGC AGGGTTTCAA AGGAAACTCT TACAAATTCT 
181 TGTTTGGGGA TATGAAAGAG ACGAAGAAAA TGGGTGAAGA AGCTATGTCT AAGCCAATCA 
241 ATTTCTCTCA TGACATGATT TGGCCTAGAG TTATGCCATT CATCCACAAA ACCATCACCA 
301 ATTATGGTAA GAATTGTATT GTGTGGTTTG GGCCAAGACC AGCAGTCCTG ATCACAGACC 
361 CGGAACTTGT AAAGGAGGTG CTAACGAAGA ATTTCGTCTA TCAGAAGCCG CTTGGCAATC 
421 CACTCACAAA GTTGGCAGCA ACTGGAATTG CAGGCTATGA AACAGATAAA TGGGCTACAC 
481 ATAGAAGGCT TCTCAATCCT GCTTTTCACC TTGACAAGTT GAAGCATATG CTACCTGCAT 
541 TCCAATTTAC TGCTAGTGAG ATGTTGAGCA AATTGGAGAA AGTTGTTTCA CCAAACGGAA 
601 CAGAGATAGA TGTGTGGCCA TATTTACAAA CTTTGACAAG TGATGCCATT TCAAGAACTG 
661 CGTTTGGAAG TAGTTATGAA GAAGGAAGAA AGATTTTTGA CCTTCAAAAA GAACAACTTT 
721 CACTAATTCT AGAAGTTTCA CGCACAATAT ATATTCCAGG ATGGAGGTTT TTGCCAACGA 
781 AAAGGAACAA AAGGATGAAG CAAATATTTA ATGAAGTACG AGCACTGGTA TTTGGAATTA 
841 TTAAGAAAAG GATGAGTATG ATTGAAAATG GAGAAGCACC TGATGATTTA TTGGGAATAT 
901 TATTGGCATC CAATTTAAAA GAAATCCAAC AACATGGAAA CAACAAGAAA TTTGGTATGA 
961 GTATTGATGA GGTGATTGAA GAGTGTAAAC TCTTCTATTT TGCTGGGCAA GAGACTACTT 
1021 CATCTTTACT TGTATGGACT ATGATTTTGT TGTGCAAATA TCCTAATTGG CAAGATAAAG 
1081 CTAGAGAAGA GGTTTTGCAA GTGTTTGGGA GTAGGGAAGT TGACTATGAC AAGTTGAATC 
1141 AGCTAAAAAT AGTAACTATG ATCTTAAACG AGGTCTTAAG GTTGTATCCA GCAGGATATG 
1201 TGATTAATCG AATGGTAAAC AAAGAAACAA AGTTAGGGAA TTTGTGTTTA CCAGCCGGCG 
1261 TACAGCTCGT GTTACCAACA ATGTTGTTGC AACATGATAC TGAAATATGG GGAGATGATG 
1321 CAATGGAGTT CAATCCAGAG AGATTTAGTG ATGGAATATC CAAAGCAACA AAAGGAAAAC 
1381 TTGTGTTTTT TCCATTTAGT TGGGGTCCAA GAATATGTAT TGGGCAAAAT TTTGCTATGT 
1441 TAGAGGCTAA AATGGCAATG GCTATGATTC TGAAAACCTA TGCATTTGAA CTCTCTCCAT 
1501 CTTATGCTCA TGCTCCTCAT CCACTACTAC TTCAACCTCA ATATGGTGCT CAATTAATTT 
1561 TGTACAAGTT GTAGATATGG TCAATCTGGA ACTTGTTATG GAACTTTTAT CATCGTAATC 
1621 AACCATATTG AGGG 

SEQ. ID. NO. 220 

1 MEIPYYSLKI AISSFAIIFV LRWAWKILNY VWLKPKELEK YLRQQGFKGN SYKFLFGDMK 
61 ETKKMGEEAM SKPINFSHDM IWPRVMPFIH KTITNYGKNC IVWFGPRPAV LITDPELVKE 
121 VLTKNFVYQK PLGNPLTKLA ATGIAGYETD KWATHRRLLN PAFHLDKLKH MLPAFQFTAS 
181 EMLSKLEKVV SPNGTEIDVW PYLQTLTSDA ISRTAFGSSY EEGRKIFDLQ KEQLSLILEV 
241 SRTIYIPGWR FLPTKRNKRM KQIFNEVRAL VFGIIKKRMS MIENGEAPDD LLGILLASNL 
301 KEIQQHGNNK KFGMSIDEVI EECKLFYFAG QETTSSLLVW TMILLCKYPN WQDKAREEVL 
361 QVFGSREVDY DKLNQLKIVT MILNEVLRLY PAGYVINRMV NKETKLGNLC LPAGVQLVLP 
421 TMLLQHDTEI WGDDAMEFNP ERFSDGISKA TKGKLVFFPF SWGPRICIGQ NFAMLEAKMA 
481 MAMILKTYAF ELSPSYAHAP HPLLLQPQYG AQLILYKL 

NAME D103-AH3 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 221 

1 ATGGTTTTTC CCATAGAAGC CTTTGTAGGA CTAGTAACCT TCACATTTCT CTTATACTTC 
61 CTATGGACAA AAAAATCTCA AAAACTTCCA AAACCCTTAC CACCGAAAAT CCCCGGAGGA 
121 TGGCCGGTAA TCGGCCACCT TTTTCACTTC AATAACGACG GCGACGACCG TCCATTAGCT 
181 CGAAAACTCG GAGACTTAGC TGATAAATAC GGCCCCGTTT TCACTTTTCG GCTAGGTCTT 
241 CCCCTTGTGC TAGTTGTAAG CAGTTACGAA GCTACAAAAG ATTGCTTCTC TACAAATGAC 
301 GCCATTTTCT CCAATCGTCC AGCTTTTCTT TACGGCGAAT ACCTTGGCTA CAATAATACA 
361 ATGCTTTTTC TAGCAAATTA CGGACCTTAC TGGCGAAAAA ATCGTAAATT AGTCATTCAG 
421 GAAGTTCTCT CTGCTAGTCG TCTCGAAAAA TTCAAACAAG TGAGATTCAC CAGAATTCAA 
481 ACGAGCATTA AGAATTTATA CACTCGAATT AATGGAAATT CGAGTACGAT AAATCTAACT 
541 GATTGGTTAG AAGAATTGAA TTTTGGTCTG ATCGTGAAAA TGATCGCTGG GAAAAATTAT 
601 GAATCCGGTA AAGGAGATGA ACAAGTGGAA AGATTTAAGA ATGCGTTTAA GGATTTTATG 
661 GTTTTATCAA TGGAATTTGT ATTATGGGAT GCATTTCCAA TTCCATTATT TAAATGGGTG 
721 GATTTTCAAG GTCATATTAA GACAATGAAA AGGACATTTA AGGATATAGA TTCTGTTTTT 
781 CAGAACTGGT TAGAGGAACA TATTAATAAA AGAGAAAAAA TGGAGGTTGG TGCAGAAGGG 
841 AATGAACAAG ATTTCATTGA TGTGGTGCTT TCAAAATTGA GTAAAGAATA TCTTGATGAA 
901 GGTTACTCTC GTGATACTGT CATTAAAGCA ACAGTTTTTA GTTTGGTCTT GGATGCAGCA 
961 GACACAGTTG CTCTTCACAT AAATTGGGGA ATGACATTAT TGATAAACAA TCAAAATGCC 
1021 TTGATGAAAG CACAAGAAGA GATAGACACA AAAGTTGGTA AGGATAGATG GGTAGAAGAG 
1081 AGTGATATTA AGGATTTAGT ATACCTCCAA GCTATTGTTA AAAAGGTGTT ACGATTATAT 
1141 CCACCAGGAC CTTTGTTAGT ACCACATGAA AATGTAAAGG ATTGTGTTGT TAGTGGATAT 
1201 CACATTCCTA AAGGGACTAG ATTATTCGCA AACGTCATGA AACTGCAGCG CGATCCTAAA 
1261 CTCTTGTCAA ATCCTGATAA GTTCGATCCA GAGAGATTCA TCGCTGGTGA TATTGACTTC 
1321 CGTGGTCACC ACTATGAGTT TATCCCATCT GGTTCTGGAA GACGATCTTG TCCGGGGATG 
1381 ACTTATGCAT TGCAAGTGGA ACACCTAACA ATGGCACATT TAATCCAGGG TTTCAATTAC 
1441 AAAACTCCAA ATGACGAGGT CTTGGATATG AAGGAAGGTG CAGGCATAAC AATACGTAAG 
1501 GTAAATCCAG TGGAATTGAT AATAACGCCT CGCTTGGCAC CTGAGCTTTA CTAAAACCTA 
1561 AGATCTTTCA TCTTGGTTGA TCATTGTTTA ATA 

SEQ. ID. NO. 222 

1 MVFPIEAFVG LVTFTFLLYF LWTKKSQKLP KPLPPKIPGG WPVIGHLFHF NNDGDDRPLA 
61 RKLGDLADKY GPVFTFRLGL PLVLVVSSYE ATKDCFSTND AIFSNRPAFL YGEYLGYNNT 
121 MLFLANYGPY WRKNRKLVIQ EVLSASRLEK FKQVRFTRIQ TSIKNLYTRI NGNSSTINLT 
181 DWLEELNFGL IVKMIAGKNY ESGKGDEQVE RFKNAFKDFM VLSMEFVLWD AFPIPLFKWV 
241 DFQGHIKTMK RTFKDIDSVF QNWLEEHINK REKMEVGAEG NEQDFIDVVL SKLSKEYLDE 
301 GYSRDTVIKA TVFSLVLDAA DTVALHINWG MTLLINNQNA LMKAQEEIDT KVGKDRWVEE 
361 SDIKDLVYLQ AIVKKVLRLY PPGPLLVPHE NVKDCVVSGY HIPKGTRLFA NVMKLQRDPK 
421 LLSNPDKFDP ERFIAGDIDF RGHHYEFIPS GSGRRSCPGM TYALQVEHLT MAHLIQGFNY 
481 KTPNDEVLDM KEGAGITIRK VNPVELIITP RLAPELY 

NAME D208-AD9 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 223 

1 ATGCTTTCTC CCATAGAAGC CATTGTAGGA CTAGTAACCT TCACATTTCT CTTCTTCTTC 
61 CTATGGACAA AAAAATCTCA AAAACCTTCA AAACCCTTAC CACCGAAAAT CCCCGGAGGA 
121 TGGCCGGTAA TCGGCCATCT TTTCCACTTC AATGACGACG GCGACGACCG TCCATTAGCT 
181 CGAAAACTCG GAGACTTAGC TGACAAATAC GGCCCCGTTT TCACTTTTCG GCTAGGCCTT 
241 CCCCTTGTCT TAGTTGTAAG CAGTTACGAA GCTGTAAAAG ACTGTTTCTC CACAAATGAC 
301 GCCATTTTTT CCAATCGTCC AGCTTTTCTT TACGGCGATT ACCTTGGCTA CAATAATGCC 
361 ATGCTATTTT TGGCCAATTA CGGACCTTAC TGGCGAAAAA ATCGAAAATT AGTTATTCAG 
421 GAAGTTCTCT CCGCTAGTCG TCTCGAAAAA TTCAAACACG TGAGATTTGC AAGAATTCAA 
481 GCGAGCATGA AGAATTTATA TACTCGAATT GATGGAAATT CGAGTACGAT AAATTTAACT 
541 GATTGGTTAG AAGAATTGAA TTTTGGTCTG ATCGTGAAGA TGATCGCTGG AAAAAATTAT 
601 GAATCCGGTA AAGGAGATGA ACAAGTGGAG AGATTTAAGA AAGCGTTTAA GGATTTTATG 
661 ATTTTATCAA TGGAGTTTGT GTTATGGGAT GCATTTCCAA TTCCATTATT TAAATGGGTG 
721 GATTTTCAAG GGCATGTTAA GGCTATGAAA AGGACTTTTA AAGATATAGA TTCTGTTTTT 
781 CAGAATTGGT TAGAGGAACA TATTAATAAA AGAGAAAAAA TGGAGGTTAA TGCAGAAGGG 
841 AATGAACAAG ATTTCATTGA TGTGGTGCTT TCAAAAATGA GTAATGAATA TCTTGGTGAA 
901 GGTTACTCTC GTGATACTGT CATTGAAGCA ACGGTGTTTA GTTTGGTCTT GGATGCAGCA 
961 GACACAGTTG CTCTTCACAT AAATTGGGGA ATGGCATTAT TGATAAACAA TCAAAAGGCC 
1021 TTGACGAAAG CACAAGAAGA GATAGACACA AAAGTTTGTA AGGACAGATG GGTAGAAGAG 
1081 AGTGATATTA AGGATTTGGT ATACCTCGAA GCTATTGTTA AAGAAGTGTT ACGATTATAT 
1141 CCACCAGGAC CTTTGTTAGT ACCACACGAA AATGTAGAAG ATTGTGTTGT TAGTGGATAT 
1201 CACATTCCTA AAGGGACAAG ATTATTCGCA AACGTCATGA AACTGCAACG TGATCCTAAA 
1261 CTCTGGTCTG ATCCTGATAC TTTCGATCCA GAGAGATTCA TTGCTACTGA TATTGACTTT 
1321 CGTGGTCAGT ACTATAAGTA TATCCCGTTT GGTCCTGGAA GACGATCTTG TCCAGGGATG 
1381 ACTTATGCAT TGCAAGTGGA ACACTTAACA ATGGCACATT TGATCCAAGG TTTCAATTAC 
1441 AGAACTCCAA ATGACGAGCC CTTGGATATG AAGGAAGGTG CAGGCATAAC TATACGTAAG 
1501 GTAAATCCTG TGGAACTGAT AATAGCGCCT CGCCTGGCAC CTGAGCTTTA TTAAAACCTA 
1561 AGATGTTTCA TCTTGGTTGA 

SEQ. ID. NO. 224 

1 MLSPIEAIVG LVTFTFLFFF LWTKKSQKPS KPLPPKIPGG WPVIGHLFHF NDDGDDRPLA 
61 RKLGDLADKY GPVFTFRLGL PLVLVVSSYE AVKDCFSTND AIFSNRPAFL YGDYLGYNNA 
121 MLFLANYGPY WRKNRKLVIQ EVLSASRLEK FKHVRFARIQ ASMKNLYTRI DGNSSTINLT 
181 DWLEELNFGL IVKMIAGKNY ESGKGDEQVE RFKKAFKDFM ILSMEFVLWD AFPIPLFKWV 
241 DFQGHVKAMK RTFKDIDSVF QNWLEEHINK REKMEVNAEG NEQDFIDVVL SKMSNEYLGE 
301 GYSRDTVIEA TVFSLVLDAA DTVALHINWG MALLINNQKA LTKAQEEIDT KVCKDRWVEE 
361 SDIKDLVYLQ AIVKEVLRLY PPGPLLVPHE NVEDCVVSGY HIPKGTRLFA NVMKLQRDPK 
421 LWSDPDTFDP ERFIATDIDF RGQYYKYIPF GPGRRSCPGM TYALQVEHLT MAHLIQGFNY 
481 RTPNDEPLDM KEGAGITIRK VNPVELIIAP RLAPELY 

NAME D237-AD1 
ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 225 

1 TTTCATATAC CTTTAGTACT CTTGAAATTT TCAAATAATG GTTTATCTTC TTTCTCCCAT 
61 AGAAGCCATT GTAGGATTTG TAACCTTTTC ATTTCTATTC TACTTTCTAT GGACCAAAAA 
121 ACAATCAAAA ATCTTAAACC CACTACCTCC AAAAATCCCA GGTGGATGGC CAGTAATCGG 
181 CCATCTCTTT TATTTCAAGA ACAATGGCGA TGAAGATCGC CATTTTTCTC AAAAACTCGG 
241 TGACTTAGCT GACAAATATG GTCCCGTCTT CACTTTCCGG TTAGGGTTTC GCCGTTTCTT 
301 GGCGGTGAGT AGTTATGAAG CTATGAAAGA ATGCTTCACT ACCAATGATA TCCATTTCGC 
361 CGATCGGCCA TCTTTACTCT ACGGAGAATA CCTTTGCTAT AATAACGCCA TGCTTGCTGT 
421 TGCCAAATAT GGCCCTTACT GGAAAAAAAA TCGAAAGTTA GTCAATCAAG AAGTTCTCTC 
481 CGTTAGTCGG CTCGAAAAAT TCAAACATGT TAGATTTTCT ATAATTCAGA AAAATATTAA 
541 ACAATTGTAT AATTGTGATT CACCAATGGT GAAGATAAAC CTTAGTGATT GGATAGATAA 
601 ATTGACATTC GACATCATTT TGAAAATGGT TGTTGGGAAG AACTATAATA ATGGACATGG 
661 AGAAATACTC AAAGTTGCTT TTCAGAAATT CATGGTTCAA GCTATGGAGA TGGAGCTCTA 
721 TGATGTTTTT CACATTCCAT TTTTCAAGTG GTTGGATCTT ACAGGGAATA TTAAGGCTAT 
781 GAAACAAACT TTCAAAGACA TTGATAATAT TATCCAAGGT TGGTTAGATG AGCACATTAA 
841 GAAGAGAGAA ACAAAGGATG TTGGAGGTGA AAACGAACAA GATTTTATAG ATGTGGTGCT 
901 TTCCAAGATG AGCGACGAAC ATCTTGGCGA GGGTTACTCT CATGACACAA CCATCAAAGC 
961 AACTGTATTC ACTTTGGTCT TGGATGCAAC AGACACACTT GCACTTCATA TAAAGTGGGT 
1021 AATGGCGTTA ATGATAAACA ATAAGCATGT CATGAAGAAA GCACAAGAAG AGATGGACAC 
1081 AATTGTTGGT AGAGATAGAT GGGTAGAAGA GAGTGATATC AAGAATTTGG TGTATCTCCA 
1141 AGCAATTGTC AAAGAAGTAT TACGATTACA TCCACCCGCA CCTTTGTCAG TGCAACACCT 
1201 ATCTGTAGAA GATTGTGTTG TCAATGGGTA CCATATTCCT AAGGGGACTG CACTACTTAC 
1261 CAATATTATG AAACTACAGC GAGATCCTCA AACATGGCCA AATCCTGATA AATTCGATCC 
1321 AGAGAGATTC CTGACGACTC ATGCTACTAT TGACTACCGC GGGCAGCACT ATGAGTCGAT 
1381 CCCCTTTGGT ACGGGGAGAC GAGCTTGTCC CGCGATGAAT TATTCATTGC AAGTGGAACA 
1441 CCTTTCAATT GCTCATATGA TCCAAGGTTT CAGTTTTGCA ACTACGACCA ATGAGCCTTT 
1501 GGATATGAAA CAAGGTGTGG GTTTAACTTT ACCAAAGAAG ACTGATGTTG AAGTGCTAAT 
1561 TACACCTCGC CTTCCTCCTA CGCTTTATCA ATATTAAGAT GTTTTGTTGT CGGGATTCGT 
1621 TCTGATCAAT CCCTCAATG 

SEQ. ID. NO. 226 

1 MVYLLSPISA IVGFVTFSFL FYFLWTKKQS KILNPLPPKI PGGWPVIGHL FYFKNNGDED 
61 RHFSQKLGDL ADKYGPVFTF RLGFRRFLAV SSYEAMKECF TTNDIHFADR PSLLYGEYLC 
121 YNNAMLAVAK YGPYWKKNRK LVNQEVLSVS RLEKFKHVRF SIIQKNIKQL YNCDSPMVKI 
181 NLSDWIDKLT FDIILKMVVG KNYNNGHGEI LKVAFQKFMV QAMEMELYDV FHIPFFKWLD 
241 LTGNIKAMKQ TFKDIDNIIQ GWLDEHIKKR ETKDVGGENE QDFIDVVLSK MSDEHLGEGY 
301 SHDTTIKATV FTLVLDATDT LALHIKWVMA LMINNKHVMK KAQEEMDTIV GRDRWVEESD 
361 IKNLVYLQAI VKEVLRLHPP APLSVQHLSV EDCVVNGYHI PKGTALLTNI MKLQRDPQTW 
421 PNPDKFDPER FLTTHATIDY RGQHYESIPF GTGRRACPAM NYSLQVEHLS IAHMIQGFSF 
481 ATTTNEPLDM KQGVGLTLPK KTDVEVLITP RLPPTLYQY 

NAME D125-AF11 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 227 

1 CTTTTTCTCC CCAAAAAAGA GCTCATTTCC CTTGTCCCCA AAAATGGATC TTCTCTTACT 
61 AGAGAAGACC TTAATTGGTC TCTTCTTTGC CATTTTAATC GCTATAATTG TCTCTAGACT 
121 TCGTTCAAAG CGTTTTAAGC TTCCCCCAGG ACCAATCCCA GTACCAGTTT TTGGTAATTG 
181 GCTTCAAGTT GGTGATGATT TAAACCACAG AAATCTTACT GATTTTGCCA AAAAATTTGG 
241 TGATCTTTTC TTGTTAAGAA TGGGCCAGCG TAATTTAGTT GTTGTGTCAT CTCCTGAATT 
301 AGCTAAAGAA GTTTTACACA CACAAGGTGT TGAATTTGGT TCAAGAACAA GAAATGTTGT 
361 ATTTGATATT TTTACTGGAA AAGGTCAAGA TATGGTTTTT ACTGTATATG GTGAACACTG 
421 GAGAAAAATG AGGAGAATTA TGACTGTACC ATTTTTTACT AATAAAGTTG TGCAGCAATA 
481 TAGAGGGGGG TGGGAGTTTG AAGTGGCAAG TGTAATTGAG GATGTGAAGA AAAATCCTGA 
541 ATCTGCTACT AATGGGATTG TATTAAGGAG GAGATTACAA TTGATGATGT ATAATAATAT 
601 GTTTAGGATT ATGTTTGATA GGAGATTTGA GAGTGAAGAT GATCCTTTGT TTGTTAAGCT 
661 TAAGGCTTTG AATGGTGAAA GGAGTAGATT GGCTCAGAGT TTTGAGTATA ATTATGGTGA 
721 TTTTATTCCC ATTTTGAGGC CTTTTTTGAG AGGTTATTTG AAGATCTGTA AAGAAGTTAA 
781 GGAGAAGAGG CTGCAGCTTT TCAAAGATTA CTTTGTTGAT GAAAGAAAGA AGCTTTCAAA 
841 TACCAAGAGC TTGGACAGCA ATGCTCTGAA ATGTGCGATT GATCACATTC TTGAGGCTCA 
901 ACAGAAGGGG GAGATCAATG AGGACAACGT TCTTTACATT GTTGAAAACA TCAATGTTGC 
961 TGCTATAGAA ACCACATTAT GGTCAATTGA GTGGGGTATC GCCGAGTTAG TCAACCACCC 
1021 TCACATCCAA AAGAAACTCC GCGACGAGAT TGACACAGTT CTTGGCCCAG GAGTGCAAGT 
1081 GACTGAACCA GACACCCACA AGCTTCCATA CCTTCAGGCT GTGATCAAGG AGACGCTTCG 
1141 TCTCCGTATG GCAATTCCTC TATTAGTCCC ACACATGAAC CTTCACGATG CAAAGCTTGG 
1201 CGGGTTTGAT ATTCCAGCAG AGAGCAAAAT CTTGGTTAAC GCTTGGTGGC TAGCTAACAA 
1261 CCCGGCTCAT TGGAAGAAAC CCGAAGAGTT CAGACCCGAG AGGTTCTTCG AAGAGGAGAA 
1321 GCACGTTGAG GCCAATGGCA ATGACTTCAG ATATCTTCCG TTTGGCGTTG GTAGGAGGAG 
1381 TTGCCCTGGA ATTATACTTG CATTGCCAAT TCTTGGCATT ACTTTGGGAC GTTTGGTTCA 
1441 GAACTTTGAG CTGTTGCCTC CTCCAGGCCA GTCGAAGCTC GACACCACAG AGAAAGGTGG 
1501 ACAGTTCAGT CTCCATATTT TGAAGCATTC CACCATTGTG TTGAAACCAA GGTCTTGCTG 
1561 AACTTTCTGA TCCTAATCAA TTAAGGGGTT GAAGAAATTT TATAATTATG 

SEQ. ID. NO. 228 

1 MDLLLLEKTL IGLFFAILIA IIVSRLRSKR FKLPPGPIPV PVFGNWLQVG DDLNHRNLTD 
61 FAKKFGDLFL LRMGQRNLVV VSSPELAKEV LHTQGVEFGS RTRNVVFDIF TGKGQDMVFT 
121 VYGEHWRKMR RIMTVPFFTN KVVQQYRGGW EFEVASVIED VKKNPESATN GIVLRRRLQL 
181 MMYNNMFRIM FDRRFESEDD PLFVKLKALN GERSRLAQSF EYNYGDFIPI LRPFLRGYLK 
241 ICKEVKEKRL QLFKDYFVDE RKKLSNTKSL DSNALKCAID HILEAQQKGE INEDNVLYIV 
301 ENINVAAIET TLWSIEWGIA ELVNHPHIQK KLRDEIDTVL GPGVQVTEPD THKLPYLQAV 
361 IKETLRLRMA IPLLVPHMNL HDAKLGGFDI PAESKILVNA WWLANNPAHW KKPEEFRPER 
421 FFEEEKHVEA NGNDFRYLPF GVGRRSCPGI ILALPILGIT LGRLVQNFEL LPPPGQSKLD 
481 TTEKGGQFSL HILKHSTIVL KPRSC 

NAME D134-AE11 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 229 

1 AACAATAAAA ATGGAGACAT TATTTAACAT CAAAGTTGCA GTTTCATTAG TAATTGTGAT 
61 AATTTTTCTG AGATGGGTAT GGAAATTCTT GAATTGGGTG TGGATTCAAC CAAAGAAAAT 
121 GGAAAAAAGA CTAAAAATGG AAGGTTTCAA AGGAAGCTCA TATAAGCTAT TATTTGGAGA 
181 TATGAAAGAA ATAAATACAA TGGTTGAAGA AGCCAAAACC AAGCCTATGA ATTTTACCAA 
241 TGATTATGTG GCTAGAGTCT TGCCTCACTT CACAAAGTTG ATGCTCCAAT ATGGCAAGAA 
301 TAGCTTTATG TGGTTAGGGC CAAAACCAAC AATGTTTATC ACAGACCCTG AACTAATAAG 
361 GGAGATCTTG TCAAAAAGTT ACATATACCA GGAGATTCAA GGCAATCCAA TCACTAAGTT 
421 GCTAGCACAA GGACTAGTAA GTTATGAAGC AGAGAAATGG GCTAAGCATA GAAAAATTAT 
481 CAATCCTGCA TTTCACCTTG ACAAGTTGAA GCATATGCTA CCATCATTCT ACTTGAGTTG 
541 TTGTGACATG CTCAGAAAAT GGGAAAGTAT AGCTTCATCA GAGGGATCAG AAATAGACGT 
601 GTGGCCTTTT CTGGAAACGT TGACAAGCGA TGCTATTTCA AGAACAGCTT TTGGTAGTAA 
661 CTATGAAGAC GGGAGACAGA TATTTGAGCT TCAAAAAGAA CAAGCTGAGT TGATTTTACA 
721 AGCAGCGCGA TGGCTTTACA TCCCCGGATG GAGGTTTGTG CCAACAAAGA GGAACAAGAG 
781 GATGAAGCAA ATCGCTAAAG AAGTACGATC ATTAGTGTTG GGAATAATCA ATAAGAGAAT 
841 AAGGGAAATG AAAGCAGGGG AAGCTGCAAA AGATGACTTA CTGGGAATAC TATTGGAATC 
901 TAATTTCAAA GAAATCCAAA TGCACGGAAA CAAGAACTTT GGCATGACTA TCGACGAAGT 
961 GATTGAAGAG TGCAAGTTAT TTTACTTTGC TGGGCAAGAA ACTACTTCAG TTTTGCTTGT 
1021 TTGGACTTTG ATTTTACTGA GTAAGCATGT CGATTGGCAA GAAAGAGCTA GAGAAGAAGT 
1081 TCATCAAGTC TTTGGAAGTA ACAAACCTGA TTATGACGCA TTGAATCAGT TGAAAGTTGT 
1141 AACGATGATA TTCAACGAGG TTTTAAGGTT GTACCCACCG GGAATTACCA TAAGTCGAAC 
1201 TGTACACGAG GATACCAAAT TAGGGAACTT GTCATTGCCA GCAGGGATAC AGCTTGTGTT 
1261 ACCTGCAATT TGGTTGCATC ATGACAATGA AATATGGGGA GATGATGCAA AGGAGTTCAA 
1321 ACCAGAGAGG TTTAGTGAAG GAGTTAATAA AGCAACAAAG GGTAAATTTG CATATTTTCC 
1381 ATTTAGTTGG GGACCAAGAA TATGTGTTGG ACTGAATTTT GCAATGTTAG AGGCAAAAAT 
1441 GGCACTTGCA TTGATTCTAC AACACTATGC TTTTGAGCTC TCTCCATCTT ATGCACATGC 
1501 TCCTCATACA ATTATCACTC TGCAACCTCA ACATGGTGCT CCTTTGATTT TGCGCAAGCT 
1561 GTAGCGCGGA TATATTGATT GGTTATCTAC TGTAG 

SEQ. ID. NO. 230 

1 METLFNIKVA VSLVIVIIFL RWVWKFLNWV WIQPKKMEKR LKMEGFKGSS YKLLFGDMKE 
61 INTMVEEAKT KPMNFTNDYV ARVLPHFTKL MLQYGKNSFM WLGPKPTMFI TDPELIREIL 
121 SKSYIYQEIQ GNPITKLLAQ GLVSYEAEKW AKHRKIINPA FHLDKLKHML PSFYLSCCDM 
181 LRKWESIASS EGSEIDVWPF LETLTSDAIS RTAFGSNYED GRQIFELQKE QAELILQAAR 
241 WLYIPGWRFV PTKRNKRMKQ IAKEVRSLVL GIINKRIREM KAGEAAKDDL LGILLESNFK 
301 EIQMHGNKNF GMTIDEVIEE CKLFYFAGQE TTSVLLVWTL ILLSKHVDWQ ERAREEVHQV 
361 FGSNKPDYDA LNQLKVVTMI FNEVLRLYPP GITISRTVHE DTKLGNLSLP AGIQLVLPAI 
421 WLHHDNEIWG DDAKEFKPER FSEGVNKATK GKFAYFPFSW GPRICVGLNF AMLEAKMALA 
481 LILQHYAFEL SPSYAHAPHT IITLQPQHGA PLILRKL 

NAME D209-AH12 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 231 

1 ATATGCAACT GAGATTTGAA GAATACCAAC TAACCAAAAT GCAGTTCTTC AGCTTGGTTT 
61 CCATTTTCCT ATTTCTATCT TTCCTCTTTT TGTTAAGGAT ATGGAAGAAC TCCAATAGCC 
121 AAAGCAAAAA GTTGCCACCA GGTCCATGGA AACTACCAAT ACTAGGAAGT ATGCTTCATA 
181 TGGTTGGTGG ACTACCACAC CATGTCCTTA GAGATTTAGC CAAAAAATAT GGACCACTTA 
241 TGCACCTTCA ATTAGGTGAA GTTTCTGCGG TTGTGGTTAC TTCTCCTGAT ACGGCAAAAG 
301 AAGTATTAAA AACTCATGAC ATCGCTTTTG CGTCTAGGCC TAGCCTTTTG GCCCCGGAGA 
361 TTGTCTGTTA CAATAGGTCT GATCTAGCCT TTTGCCCCTA TGGCGACTAT TGGAGACAAA 
421 TGCGTAAAAT ATGTGTCTTG GAAGTGCTCA GTGCCAAGAA TGTTCGGACA TTTAGCTCTA 
481 TTAGGCGGAA TGAAGTTCTT CGTCTCATTA ATTTTATCCG GTCATCTTCT GGTGAACCTA 
541 TTAATGTTAC GGAAAGGATC TTTTTGTTCA CAAGCTCCAT GACATGTAGA TCAGCGTTTG 
601 GGCAAGTGTT CAAAGAGCAA GACAAATTTA TACAACTAAT TAAAGAAGTG ATACTCTTAG 
661 CAGGAGGGTT TGATGTGGCT GACATATTCC CTTCACTGAA GTTTCTTCAT GTGCTCAGTG 
721 GAATGAAGGG TAAGATTATG AATGCACACC ATAAGGTAGA TGCCATTGTT GAGAATGTCA 
781 TCAATGAGCA CAAGAAAAAT CTTGCAATTG GGAAAACTAA TGGAGCGTTA GGAGGTGAAG 
841 ATTTAATTGA TGTTCTTCTA AGACTTATGA ATGATGGAGG CCTTCAATTT CCTATCACCA 
901 ACGACAACAT CAAAGCCATA ATTTTTGACA TGTTTGCTGC CGGGACAGAG ACTTCATCGT 
961 CAACAATTGT GTGGGCTATG GTAGAAATGG TGAAAAATCC AGCCGTATTC GCGAAAGCTC 
1021 AAGCAGAAGT AAGAGAAGCA TTTAGAGGAA AAGAAACTTT CGATGAAAAT GATGTGGAGG 
1081 AGCTAAACTA CCTAAAGTTA GTAATAAAAG AAACTCTAAG ACTTCATCCA CCGGTTCCAC 
1141 TTTTGCTCCC AAGAGAATGT AGGGAAGAGA CAAATATAAA CGGCTACACT ATTCCTGTAA 
1201 AGACCAAAGT CATGGTTAAT GTTTGGGCTT TGGGAAGAGA TCCAAAATAT TGGAATGACG 
1261 CAGAAACTTT TATGCCAGAG AGATTTGAGC AGTGCTCTAA GGATTTTGTT GGTAATAATT 
1321 TTGAATATCT TCCATTTGGT GGCGGAAGGA GGATTTGTCC TGGGATTTCG TTTGGCTTAG 
1381 CTAATGCTTA TTTGCCATTG GCTCAATTAC TATATCACTT CGATTGGAAA CTCCCTGCTG 
1441 GAATCGAACC AAGCGACTTG GACTTGACTG AGTTGGTTGG AGTAACTGCC GCTAGAAAAA 
1501 GTGACCTTTA CTTGGTTGCG ACTCCTTATC AACCTCCTCA AAAGTGATTT AATGGTTTCA 
1561 AGTTTTTATT TCCTAGCAAA CCCCACTATT GTCCTATCTT TCTTTTGGTG TTTTCGGTTT 
1621 TATCTACTCT AATACATGCA TCTTTTACCA TATAGGAATG TACCATGTTG TCG 

SEQ. ID. NO. 232 

1 MQLRFEEYQL TKMQFFSLVS IFLFLSFLFL LRIWKNSNSQ SKKLPPGPWK LPILGSMLHM 
61 VGGLPHHVLR DLAKKYGPLM HLQLGEVSAV VVTSPDTAKE VLKTHDIAFA SRPSLLAPEI 
121 VCYNRSDLAF CPYGDYWRQM RKICVLEVLS AKNVRTFSSI RRNEVLRLIN FIRSSSGEPI 
181 NVTERIFLFT SSMTCRSAFG QVFKEQDKFI QLIKEVILLA GGFDVADIFP SLKFLHVLSG 
241 MKGKIMNAHH KVDAIVENVI NEHKKNLAIG KTNGALGGED LIDVLLRLMN DGGLQFPITN 
301 DNIKAIIFDM FAAGTETSSS TIVWAMVEMV KNPAVFAKAQ AEVREAFRGK ETFDENDVEE 
361 LNYLKLVIKE TLRLHPPVPL LLPRECREET NINGYTIPVK TKVMVNVWAL GRDPKYWNDA 
421 ETFMPERFEQ CSKDFVGNNF EYLPFGGGRR ICPGISFGLA NAYLPLAQLL YHFDWKLPAG 
481 IEPSDLDLTE LVGVTAARKS DLYLVATPYQ PPQK 

NAME D221-BB8 
ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 233 

1 GAATTATTTC ACGTGTTGTA TTCCTTGTCT ATGATAGGAA GCTCGTTACC TCAGCGTACA 
61 AACCCCAAAT AAAAAATGAA TTTCCTTGTG GTGTTAGCTT CTCTCTTTCT CTTTGTGTTC 
121 CTAATGAGGA TAAGCAAAGC AAAAAAGCTC CCTCCAGGTC CAAGGAAACT GCCTATAATA 
181 GGAAACCTTC ATCAAATTGG AAAATTACCT CATCGTTCAC TTCAAAAACT TTCTAATGAA 
241 TATGGGGATT TCATTTTCTT GCAATTAGGT TCTGTACCGA CTGTGGTTGT CTCCTCAGCT 
301 GACATTGCCC GAGAGATCTT TAGAACTCAC GACCTTGTTT TCTCAGGCCG TCCTGCTTTA 
361 TATGCTGCCA GAAAACTTTC CTACAATTGC TACAACGTTT CATTTGCACC CTATGGTAAT 
421 TACTGGAGAG AGGCTCGGAA AATTCTAGTG TTGGAGTTGC TAAGTACAAA GAGAGTACAA 
481 AGTTTCGAGG CAATTCGAGA CGAGGAAGTA AGTAGCTTGG TTCAAATTAT CTGTAGTTCC 
541 TTGAGCTCAC CTGTTAACAT AAGCACATTA GCACTATCCT TGGCAAATAA CGTTGTTTGT 
601 CGAGTGGCTT TTGGGAAAGG GAGTGCTGAA GGAGGAAATG ATTATGAGGA TAGGAAGTTT 
661 AATGAAATTC TATATGAGAC ACAAGAATTA TTGGGTGAGT TTAACGTTGC TGATTATTTT 
721 CCTCGGATGG CATGGATTAA CAAAATAAAT GGGTTTGATG AACGATTGGA AAATAATTTT 
781 AGGGAATTGG ATAAGTTTTA TGACAAAGTA ATAGAAGATC ATCTTAATTC ATGTAGCTGG 
841 ATGAAACAAA GGGATGATGA AGACGTTATT GATGTATTGC TTCGAATTCA AAAGGATCCA 
901 AGCCAAGAAA TTCCTCTCAA AGATGATCAC ATTAAGGGCC TTCTTGCGGA TATATTCATA 
961 GCTGGAACTG ATACATCATC AACAACCATA GAATGGGCAA TGTCAGAACT CATAAAAAAT 
1021 CCAAGAGTCT TGAGAAAAGC TCAAGAGGAA GTTAGAGAAG TTTCTAAGGG AAAACAAAAG 
1081 GTCCAAGAAA GTGATCTTTG CAAACTAGAT TACTTGAAAT TGGTCATCAA AGAAACCTTT 
1141 AGACTACACC CACCAGTCCC ATTACTAGTC CCTCGAGTAA CAACAGCCAG CTGCAAAATA 
1201 ATGGAATACG AAATTCCAGT AAATACAAGA GTCTTCATCA ACGCGACAGC AAATGGGACA 
1261 AATCCAAAAT ACTGGGAAAA TCCATTGACA TTCTTGCCAG AGAGATTCTT GGATAAGGAG 
1321 ATTGATTACA GAGGCAAAAA TTTTGAGTTG TTGCCATTTG GGGCAGGGAG AAGAGGGTGT 
1381 CCAGGAATTA ATTTTTCAAT ACCACTTGTT GAGCTTGCAC TTGCTAATCT ATTGTTTCAT 
1441 TATAATTGGT CACTTCCTGA AGGGATGCTA GCTAAGGATG TTGATATGGA AGAAGCTTTG 
1501 GGGATTACCA TGCACAAGAA ATCTCCCCTT TGCTTAGTAG CTTCTCATTA TACTTGTTGA 
1561 GATTTTAAAA GATTTTAGCA TAGCTATATA TAGCTTGAAG T 

SEQ. ID. NO. 234 

1 MNFLVVLASL FLFVFLMRIS KAKKLPPGPR KLPIIGNLHQ IGKLPHRSLQ KLSNEYGDFI 
61 FLQLGSVPTV VVSSADIARE IFRTHDLVFS GRPALYAARK LSYNCYNVSF APYGNYWREA 
121 RKILVLELLS TKRVQSFEAI RDEEVSSLVQ IICSSLSSPV NISTLALSLA NNVVCRVAFG 
181 KGSAEGGNDY EDRKFNEILY ETQELLGEFN VADYFPRMAW INKINGFDER LENNFRELDK 
241 FYDKVIEDHL NSCSWMKQRD DEDVIDVLLR IQKDPSQEIP LKDDHIKGLL ADIFIAGTDT 
301 SSTTIEWAMS ELIKNPRVLR KAQEEVREVS KGKQKVQESD LCKLDYLKLV IKETFRLHPP 
361 VPLLVPRVTT ASCKIMEYEI PVNTRVFINA TANGTNPKYW ENPLTFLPER FLDKEIDYRG 
421 KNFELLPFGA GRRGCPGINF SIPLVELALA NLLFHYNWSL PEGMLAKDVD MEEALGITMH 
481 KKSPLCLVAS HYTC 

NAME D222-BH4 
ORGANISM N I COT I ANA 
SEQ. ID. NO. 235 

1 CAAAGACTAA 
61 TTTCAAATCA 
121 GCTTCCGTTA 
181 ACTAT CAAAT 
241 CGCTT CATCA 
3 01 CAAACCCATT 

3 61 CCCATATGGG 
421 CAAAAGAGTC 

4 81 GATTAGGGAA 
541 CGACATAGTT 
601 ATTCAAGTCT 
661 CATGCCGTGG 
721 GGCGAAAGAG 
781 AT CAGACACT 
841 AAACAAGGCT 
901 TGCTGCGGGA 
961 AAATCCAAAA 

1021 AGAGGTAACA 
1081 TTCCAGGCTA 
1141 TTTGGGCTAC 
1201 AGATCCAAAC 
12 61 CATAGATTAC 
1321 CCCTGGCATC 
1381 TTTTGATTTC 
1441 TGGAATTACT 
1501 TTTATTTTAG 
1561 ATGCTTGGTA 

SEQ. ID. NO. 236 

1 MSVFAVISFF 
61 HGPMMMLQFG 
121 YWRNARS I CM 
181 RVALGRKYCD 
241 DAFLEDVIEE 
301 DTTSTLLEWT 
361 SPVPLLPREA 
421 GLHFELIPFG 
4 81 RRK FPL LAVA 



TABACUM 



AAGATGTCGG 
TATTTGCCCT 
AT CGGT CACT 
GAACATGGTC 
GCTGAAGCTG 
TCAACCATTC 
GATTACT GGA 
CAGTCTTTTC 
TCGCCAAATT 
TGCAGGGTGG 
TTGCTGTTAG 
CTTGCATGGA 
TTTGATGCAT 
GAAGCT GAAG 
GGTTTTCAAG 
ACAGATACAA 
ACATTGAATA 
GAGGATGACT 
CACTCTCCAG 
GATATAGCTG 
CTTTGGGAAA 
AAAGGCTTAC 
ACATTTGCTA 
TCGCTACCAA 
GTTAGAAGGA 
AGCTCATTCT 



TCTTTGCGGT 
CATCGAAAAC 
TCCACAAACT 
CCAT GAT GAT 
CTTCCGAAAT 
CTAGCAAGCT 
GGAATGCCAG 
GAAAGATAAG 
CAGAAGT CGA 
CCTTAGGAAG 
AGTTTGTGGA 
TGAATCGTTT 
TTTTGGAGGA 
GGGCAGACTT 
TCGAAATGGA 
CTTCCACGCT 
AGTT GAGAGA 
TAGAGAAAAT 
TGCCACTTCT 
CAGGGACTCA 
AT CCAGAGGA 
ATTT CGAGTT 
AGTTTGTGAA 
AAGGAGTTAA 
AGTTCCCCCT 
AT GCCTTAAA 



TATTTCATTC 
AAAGAAAAAT 
AGGCTTACAA 
GCTTCAATTC 
CAT GAAAACC 
TTTCTTCGGC 
AAGCATTTGC 
GGAAGAAGAG 
TTTAACGGAG 
GAAGTATTGT 
ATTGTTGGGA 
CAATGGTTTG 
TGTGATTGAG 
CGTGGATATA 
TGCAATCAAA 
TCTAGAGTGG 
T GAG GT GAGA 
GCCGTATTTA 
ACCTCGAGAA 
AGTCCTCGTT 
GTTTCAACCT 
AATTCCATTC 
TGAGCTAGCA 
GCAT GAGGAT 
TTTAGCCGTC 
AACTACTACT 



TTTCTACTTC 
TCTCCACCAT 
CCTCACCGTT 
GGTAGC GTAC 
CAAGATTTGT 
CCAAAGGACG 
ATGCTTCAGC 
ACTTCTCTTC 
CTGTTCGTTT 
GAT GGGGAAG 
GTTTTTAACA 
AAT GCCAAAG 
GAACAC GGAG 
TTATT GCAGG 
GCTATTATCA 
ACAAT GAACG 
CAAGTGACTC 
AGAG CAGCAG 
GCAATTAAGG 
TGTCCATGGG 
GAAAGATTCT 
GGTGCAGGTC 
TTGGCAAGAT 
TTGGACGTGG 
GCCACTCCAT 
AGATAACT GC 



TGTTTTTTCT 
CTCCTTCAAA 
CTCTACAAAA 
CTGTGCTTAT 
CTTTTGCAAA 
TTGCCTTCAC 
TTTTGAACAA 
TTCTCCAGAG 
CCATGACTAA 
AAGGGAGGAA 
TTGGAGATTA 
TGGATAAAGT 
GAAATAAGAA 
TT CACAAAGA 
TGGATATGTT 
AGCTCTTAAG 
AAGGGAAGAC 
TTAAGGAGAG 
AT GCAAAGGT 
CAATCTCAAG 
T GGAT ACT T C 
GGAGGGGTTG 
TAATGTTCCA 
AGGAAGCT GC 
GCTCGTGATT 
GTAGTAAATA 



LLLFFLFKSY 
SVPVLIASSA 
LQLLNNKRVQ 
GEEGRKFKSL 
HGGNKK5DTE 
MNELLRNPKT 
I KDAKVLGYD 
AGRRGCPGIT 
TPCS 



LPSSKTKKNS 
EAASEIMKTQ 
SFRKIREEET 
LLEFVELLGV 
AEGADFVDIL 
LNKLRDEVRQ 
IAAGTQVLVC 
FAKFVNELAL 



PPSPSKLPLI 
DLSFANKPIS 
SLLLQRIRES 
FNIGDYMPWL 
LQVHKENKAG 
VTQGKTEVTE 
PWAISRDPNL 
ARLMFHFDFS 



GHFHKLGLQP 
TIPSKLFFGP 
PNSEVDLTEL 
AWMNRFNGLN 
FQVEMDAIKA 
DDLEKMPYLR 
WENPEEFQPE 
LPKGVKHEDL 



HRSLQKLSNE 
KDVAFTPYGD 
FVSMTNDIVC 
AKVDKVAKEF 
I IMDMFAAGT 
AAVKESSRLH 
RFLDTSIDYK 
DVEEAAGITV 
NAME D224-AF10
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 237

1 ATTATCCATC ACCTAAAATG GAGAATTCTT GGGTTTTTCT AGCCTTGGCA GGGCTATCTG 
61 CATTAGCTTT TCTCTGTAAA ATAATCACCT GTCGAAGACC GGTTAACCGG AAAATACCAC 
121 CAGGTCCAAA ACCATGGCCC ATCATTGGCA ATTTGAACCT ACTTGGTCCT AT AC CAC AT C 
181 AATCTTTTGA CTTGCTTTCC AAAAAATATG GAGAGTT GAT GCTGCTGAAA TTTGGCTCCA 
241 GGCCAGTTCT TGTTGCTTCA TCTGCTGAAA TGGCAAAACA GTTTTTAAAA GT AC AT GAT G 
301 CTAATTTCGC CTCCCGTCCT ATGCTAGCTG GT GGAAAGTA TACAAGCTAT AACTATTGTG 
361 AC AT GACAT G GGCACCCTAT GGTCCCTATT GGCGCCAAGC ACGACGAATT T AC CTTAAC C 
421 AGATATTTAC TCCGAAAAGG CTAGACT CGT T C GAGT AC AT TCGTGTTGAA GAAAGGCAGG 
481 CCTTGATTTC CCAGCTGAAT TCCCTTGCTG GAAAGCCATT TTTTCTCAAA GACCATTTGT 
541 CGCGATTTAG CCTCTGCAGC AT GACAAGGA TGGTTTTGAG CAACAAGTAC TTTGGTGAAT 
601 CAACAGTTAG AGTAGAAGAT TT GCAGTAC C TGGTAGATCA ATGGTTCTTA CTTAATGGTG 
661 CTTTCAACAT TGGAGATTGG ATTCCATGGC TCAGCTTCTT GGACCTACAA GGCTATGTGA 
721 AACAAATGAA GGCTTT GAAA AGAACTTTTG ATAAGTTCCA CAACATT GT G CTAGAT GAT C 
781 GCAGGGCTAA GAAGAATGCA GAGAAGAACT TTGTCCCAAA AGACATGGTT GATGTCTTGT
841 TGAAGATGGC TGAAGATCCT AATCTGGAAG TCAAACTCAC TAATGACTGT GTCAAAGGGT 
901 TAAT GCAGGA TTTACTAACT GGAGGAACAG ATAGCTTAAC AGCAGCAGTG CAATGGGCAT 
961 TTCAAGAACT TCTTAGACGG CCAAGGGTTA TTGAGAAGGC AACCGAAGAG CTTGACCGGA 
1021 TTGTCGGGAA AGAGAGATGG GTAGAAGAGA AAGATTGCTC GCAGCTATCT TACGTTGAAG 
1081 CAATCCTCAA GGAAACACTA AGGTTACATC CTCTAGGAAC TAT GCTAGCA CCGCATTGTG 
1141 CTATAGAAGA TTGTAACGTG GCTGGTTATG ACATACAGAA AGGAAC GACC GTTCTGGTGA 
1201 ATGTTTGGAC CAT T GGAAGG GACCCAAAAT ACT GGGATAG AGCACAAGAG TTTCTCCCCG 
1261 AGAGATTCTT AGAGAACGAC ATTGATATGG ACGGACATAA CTTTGCTTTC TTGCCATTTG
1321 GCTCGGGGCG AAGGAGGTGC CCTGGCTATA GCCTT GGACT TAAGGT TAT C CGAGTAACAT 
1381 TAGCCAACAT GTTGCATGGA TTCAACTGGA AATTACCTGA AGGTAT GAAG C C AG AAG AT A 
1441 TAAGTGTGGA AGAACATTAT GGGCTCACTA CAC AT C CT AA GTTTCCTGTT CCTGTGATCT 
1501 TGGAATCTAG ACTTTCTTCA GATCTCTATT CCCCCATCAC TTAAT CCTAA GTGCTTCCTA 
1561 TTATAGCATC ATATCAATAT CCCTC

SEQ. ID. NO. 238 

1 MENSWVFLAL AGLSALAFLC KIITCRRPVN RKIPPGPKPW PIIGNLNLLG PIPHQSFDLL
61 SKKYGELMLL KFGSRPVLVA SSAEMAKQFL KVHDANFASR PMLAGGKYTS YNYCDMTWAP 
121 YGPYWRQARR IYLNQIFTPK RLDSFEYIRV EERQALISQL NSLAGKPFFL KDHLSRFSLC 
181 SMTRMVLSNK YFGESTVRVE DLQYLVDQWF LLNGAFNIGD WIPWLSFLDL QGYVKQMKAL 
241 KRTFDKFHNI VLDDRRAKKN AEKNFVPKDM VDVLLKMAED PNLEVKLTND CVKGLMQDLL
301 TGGTDSLTAA VQWAFQELLR RPRVIEKATE ELDRIVGKER WVEEKDCSQL SYVEAILKET 
361 LRLHPLGTML APHCAIEDCN VAGYDIQKGT TVLVNVWTIG RDPKYWDRAQ EFLPERFLEN 
421 DIDMDGHNFA FLPFGSGRRR CPGYSLGLKV IRVTLANMLH GFNWKLPEGM KPEDISVEEH
481 YGLTTHPKFP VPVILESRLS SDLYSPIT 
NAME D224-BD11
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 239

1 CTCATTATCC AT CAC CTAAA AT GGAGAATT CTTGGGTTTT TCTAGCCTTG GCAGGGCTAT 
61 CTGCATTAGC TTTTCTCTGT AAAATAATCA CCTGTCGAAG ACCGGTTAAC CGGAAAATAC 
121 CACCAGGTCC AAAACCATGG CCCATCATTG GCAATTTGAA CCTACTTGGT CCTATACCAC 
181 ATCAATCTTT TGACTTGCTT TCCAAAAAAT AT GGAGAGTT GATGCTGCTG AAATTTGGCT 
241 CCAGGCCAGT TCTTGTTGCT TCATCTGCTG AAAT GGCAAA ACAGTTTTTA AAAGT AC AT G 
301 AT GCTAAT TT CGCCTCCCGT CCTATGCTAG CTGGTGGAAA GTATACAAGC TATAACTATT 
361 GTGACATGAC ATGGGCACCC TATGGTCCCT ATTGGCGCCA AGCACGACGA CGAATTTACC 
421 TTAACCAGAT ATTTACTCCG AAAAG GCTAG ACTCGTTCGA GTACATT CGT GTT GAAGAAA 
481 GGCAGGCCTT GATTTCCCAG CTGAATTCCC TTGCTGGAAA GCCATTTTTT CTCAAAGACC
541 ATTTGTCGCG ATTTAGCCTC TGCAGCATGA CAAGGAT GGT TTTGAGCAAC AAGTATTTTG 
601 GTGAATCAAC AGTTAGAGTA GAAGATTTGC AGTACCTGGT AGATCAATGG TTCTTACTTA 
661 ATGGTGCTTT CAACATT GGA GATTGGATTC CAT GGCT CAG CTTCTTGGAC CTACAAGGCT 
721 AT GT GAAACA AATGAAGGCT TTGAAAAGAA CTTTTGATAA GTT C CACAAC ATTGTGCTAG 
781 AT GAT CAC AG GGCTAAGAAG AAT GCAGAGA AGAACTTTGT CCCAAAAGAC AT GGTT GAT G 
841 TCTTGTTGAA GAT GGCT GAA GATCCTAATC TGGAAGTCAA ACT CACTAAT GACTGTGTCA 
901 AAGGGTTAAT GCAGGATTTA CTAACT G GAG GAACAGATAG CTTAACAGCA GCAGT GCAAT 
961 GGGCATTTCA AGAACTTCTT AGACAGCCAA GGGTTATTGA GAAGGCAACC GAAGAGCT T G 
1021 ACCGGATTGT CGGGAAAGAG AGATGGGTAG AAGAGAAAGA TTGCTCGCAG CTAT CTTACG 
1081 TTGAAGCAAT CCT CAAGGAA ACACTAAGGT TACATCCTCT AGGAACT AT G CTAGCACCGC 
1141 ATT GT GCTAT AGAAGATT GT AACGTGGCTG GTTAT GACAT ACAGAAAGGA ACGACCGTTC 
1201 TGGTGAATGT TT GGAC CAT T GGAAGGGACC CAAAAT ACT G GGATAGAGCA CAAGAGTTTC 
1261 TCCCCGAGAG ATTCTTAGAG AACGACATTG ATATGGACGG ACATAACTTT GCTTTCTTGC
1321 CATTTGGCTC GGGGCGAAGG AGGTGCCCTG GCTATAGCCT TGGACTTAAG GTTATCCGAG 
1381 TAACAT TAGC CAACATGTTG CAT GGATTCA ACT GGAAATT ACCTGAAGGT AT GAAGCCAG 
1441 AAGATATAAG T GT GGAAGAA CAT TAT G GGC TCACTACACA TCCTAAGTTT CCTGTTCCTG 
1501 T GAT CT T GGA AT CT AGACTT TCTTCAGATC TCTATTCCCC CAT CACTTAA TCCTAAGTGC 
1561 TT C CTAT TAT AGCAT CATAT CAATAT C CCT C 

SEQ. ID. NO. 240 

1 MENSWVFLAL AGLSALAFLC KIITCRRPVN RKIPPGPKPW PIIGNLNLLG PIPHQSFDLL 
61 SKKYGELMLL KFGSRPVLVA SSAEMAKQFL KVHDANFASR PMLAGGKYTS YNYCDMTWAP
121 YGPYWRQARR RIYLNQIFTP KRLDSFEYIR VEERQALISQ LNSLAGKPFF LKDHLSRFSL
181 CSMTRMVLSN KYFGESTVRV EDLQYLVDQW FLLNGAFNIG DWIPWLSFLD LQGYVKQMKA 
241 LKRTFDKFHN IVLDDHRAKK NAEKNFVPKD MVDVLLKMAE DPNLEVKLTN DCVKGLMQDL 
301 LTGGTDSLTA AVQWAFQELL RQPRVIEKAT EELDRIVGKE RWVEEKDCSQ LSYVEAILKE 
361 TLRLHPLGTM LAPHCAIEDC NVAGYDIQKG TTVLVNVWTI GRDPKYWDRA QEFLPERFLE
421 NDIDMDGHNF AFLPFGSGRR RCPGYSLGLK VIRVTLANML HGFNWKLPEG MKPEDISVEE
481 HYGLTTHPKF PVPVILESRL SSDLYSPIT
NAME D228-AD7
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 241

1 TGATAATGCT CTTTCTACTC TTTGTAGCCC TTCCTTTCAT TCTTATTTTT CTTCTTCCTA
61 AATTCAAAAA TGGTGGAAAT AACAGATTGC CACCAGGTCC TATAGGTTTA CCATTCATTG
121 GAAATTTGCA TCAATACGAT AGTATAACTC CTCATATCTA TTTTTGGAAA CTTTCAAAAA
181 AATATGGCAA AATCTTCTCA TTAAAACTTG CTTCTACTAA TGTGGTAGTA GTTTCTTCAG
241 CAAAATTAGC AAAAGAAGTA TTGAAAAAAC AAGATTTAAT ATTTTGTAGT AGACCATCTA
301 TTCTTGGCCA ACAAAAACTG TCTTATTATG GTCGTGATAT TGCTTTTAAT GATTATTGGA
361 GAGAAATGAG AAAAATTTGT GTTCTTCATC TTTTTAGTTT AAAAAAAGTT CAATTATTTA
421 GTCCAATTCG TGAAGATGAA GTTTTTAGAA TGATTAAGAA AATATCAAAA CAAGCTTCTA
481 CTTCACAAAT TATTAATTTG AGTAATTTAA TGATTTCATT AACAAGTACA ATTATTTGTA
541 GAGTTGCTTT TGGTGTTAGG ATTGAAGAAG AAGCACATGC AAGGAAGAGA TTTGATTTTC
601 TTTTGGCCGA GGCACAAGAA ATGATGGCTA GTTTCTTTGT ATCTGATTTT TTTCCCTTTT
661 TAAGTTGGAT TGATAAATTA AGTGGATTGA CATATAGACT TGAGAGGAAT TTCAAGGATT
721 TGGATAATTT TTATGAAGAA CTCATTGAGC AACATCAAAA TCCTAATAAG CCAAAATATA
781 TGGAAGGAGA TATTGTTGAT CTTTTGCTAC AATTGAAGAA AGAGAAATTA ACACCACTTG
841 ATCTCACTAT GGAAGATATA AAAGGAATTC TCATGAATGT GTTAGTTGCA GGATCAGACA
901 CTAGTGCAGC TGCTACTGTT TGGGCAATGA CAGCCTTGAT AAAGAATCCT AAAGCCATGG
961 AAAAAGTTCA ATTAGAAATC AGAAAATCAG TTGGGAAGAA AGGCATTGTA AATGAAGAAG
1021 ATGTCCAAAA CATCCCTTAT TTTAAAGCAG TGATAAAGGA AATATTTAGA TTGTATCCAC
1081 CAGCTCCACT TTTAGTTCCA AGAGAATCAA TGGAAAAAAC CATATTAGAA GGTTATGAAA
1141 TTCGGCCAAG AACCATAGTT CATGTTAACG CTTGGGCTAT AGCAAGGGAT CCTGAAATAT
1201 GGGAAAATCC AGATGAATTT ATACCTGAGA GATTTTTGAA TAGCAGTATC GATTACAAGG
1261 GTCAAGATTT TGAGTTACTT CCATTTGGTG CAGGCAGAAG AGGTTGCCCA GGTATTGCAC
1321 TTGGGGTTGC ATCCATGGAA CTTGCTTTGT CAAATCTTCT TTATGCATTT GATTGGGAGT
1381 TGCCTTATGG AGTAAAAAAA GAAGACATCG ACACAAACGT TAGGCCTGGA ATTGCCATGC
1441 ACAAGAAAAA CGAACTTTGC CTTGTCCCAA AAAATTATTT ATAAATTATA TTGGGACGTG
1501 GATCTCATGC TAGTTCTGTG CGGTCAGCTA AGCTTATTAT TTTTGGCTCA AATTATGTAT
1561 ACATAATTAG TACATGTTTA AAATGTATAA ATATAGTAGA ACCATTCTCA TGGTT

SEQ. ID. NO. 242

1 MLFLLFVALP FILIFLLPKF KNGGNNRLPP GPIGLPFIGN LHQYDSITPH IYFWKLSKKY
61 GKIFSLKLAS TNVVVVSSAK LAKEVLKKQD LIFCSRPSIL GQQKLSYYGR DIAFNDYWRE
121 MRKICVLHLF SLKKVQLFSP IREDEVFRMI KKISKQASTS QIINLSNLMI SLTSTIICRV
181 AFGVRIEEEA HARKRFDFLL AEAQEMMASF FVSDFFPFLS WIDKLSGLTY RLERNFKDLD
241 NFYEELIEQH QNPNKPKYME GDIVDLLLQL KKEKLTPLDL TMEDIKGILM NVLVAGSDTS
301 AAATVWAMTA LIKNPKAMEK VQLEIRKSVG KKGIVNEEDV QNIPYFKAVI KEIFRLYPPA
361 PLLVPRESME KTILEGYEIR PRTIVHVNAW AIARDPEIWE NPDEFIPERF LNSSIDYKGQ
421 DFELLPFGAG RRGCPGIALG VASMELALSN LLYAFDWELP YGVKKEDIDT NVRPGIAMHK
481 KNELCLVPKN YL



WO 2004/035745
FIG. 122
63/111
PCT/US2003/032722

NAME D228-AH8
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 243

1 TGATAATGCT CTTTCTACTC TTTGTAGCCC TTCCTTTCAT TCTTATTTTT CTTCTTCCTA 
61 AATTCAAAAA TGGTGGAAAT AACAGATTGC CACCAGGTCC TATAGGTTTA C CATT CAT T G 
121 GAAATTTGCA TCAATATGAT AGTATAACTC CT CAT AT CTA TTTTTGGAAA CTTTCCAAAA 
181 AATAT GGCAA AATCTTCTCA TTAAAACTTG CTTCTACTAA TGTGGTAGTA GTTTCTTCAG 
241 CAAAAT TAGC AAAAGAAGTA TTGAAAAAAC AAGATTTAAT ATTTT GTAGT AGACCAT CTA 
301 TTCTTGGCCA ACAAAAACTG TCTTATTATG GTCGT GAT AT TGCTTTTGCA CCTTATAATG 
361 ATTATT GGAG AGAAAT GAGA AAAATTTGTG TTCTTCATCT TTTTAGTTTA AAAAAAGTTC 
421 AATTATTTAG TCCAATTCGT GAAGAT GAAG TTTTTAGAAT GAT T AAGAAA ATATCAAAAC 
481 AAGCTTCTAC TTCACAAATT ATTAATTTGA GTAATTTAAT GATTTCATTA ACAAGTACAA 
541 TTATTTGTAG AGTTGCTTTT GGTGTTAGGT TTGAAGAAGA AGCACAT GCA AGGAAGAGAT 
601 TTGATTTTCT TTTGGCCGAG GCACAAGAAA T GAT GGCTAG TTTCTTTGTA TCTGATTTTT 
661 TTCCCTTTTT AAGTTGGATT GATAAATTAA GTGGATTGAC ATATAGACTT GAGAGGAATT 
721 TCAAGGATTT GGATAATTTT TAT GAAGAAC T CATT GAG CA AC AT CAAAAT CCTAATAAGC 
781 CAAAATATAT GGAAGGAGAT ATTGTTGATC TTTTGCTACA ATT GAAGAAA GAGAAATTAA 
841 C AC C ACT T GA TCTCACTATG GAAGATATAA AAGGAATTCT CATGAATGTG TTAGTT GCAG 
901 GATCAGACAC TAGT GCAGCT GCTACTGTTT GGGCAATGAC AGCCTTGATA AAGAAT CCTA 
961 AAGC CAT GGA AAAAGTTCAA TTAGAAATCA GAAAATCAGT TGGGAAGAAA GGCATTGTAA 
1021 AT GAAGAAGA TGTCCAAAAC ATCCCTTATT TTAAAGCAGT GATAAAGGAA ATATTTAGAT 
1081 TGTATCCACC AGCT CCACTT TTAGTT CCAA GAGAATCAAT GGAAAAAACC ATATTAGAAG 
1141 GTTATGAAAT TCGGCCAAGA ACCATAGTTC ATGTTAACGC TTGGGCTATA GCAAGGGATC 
1201 CT GAAATAT G GGAAAAT CCA GATGAATTTA TACCTGAGAG ATTTTTGAAT AGCAGTATCG 
1261 ATTACAAGGG TCAAGATTTT GAGTTACTTC CATTTGGTGC AGGCAGAAGA GGTTGCCCAG 
1321 GTATT GCACT TGGGGTTGCA TCCATGGAAC TTGCTTTGTC AAATCTTCTT TATGCATTTG 
1381 ATTGGGAGTT GCCTTATGGA GTGAAAAAAG AAGACATCGA CACAAACGTT AGGCCT GGAA 
1441 TT GC CAT G CA CAAGAAAAAC GAACTTTGCC TTGTCCCAAA AAATTATTTA TAAATTATAT 
1501 TGGGACGTGG ATCTCATGCT AGTTCTGTGC GGT CAGCTAA GCTTATTATT TTTGGCTCAA 
1561 ATTAT GTATA CATAATTAGT ACAT GTTTAA AATGTATAAA TATAGTAGAA CCATT CT CAT 
1621 GGTT 

SEQ. ID. NO. 244 

1 MLFLLFVALP FILIFLLPKF KNGGNNRLPP GPIGLPFIGN LHQYDSITPH IYFWKLSKKY
61 GKIFSLKLAS TNVVVVSSAK LAKEVLKKQD LIFCSRPSIL GQQKLSYYGR DIAFAPYNDY
121 WREMRKICVL HLFSLKKVQL FSPIREDEVF RMIKKISKQA STSQIINLSN LMISLTSTII
181 CRVAFGVRFE EEAHARKRFD FLLAEAQEMM ASFFVSDFFP FLSWIDKLSG LTYRLERNFK
241 DLDNFYEELI EQHQNPNKPK YMEGDIVDLL LQLKKEKLTP LDLTMEDIKG ILMNVLVAGS
301 DTSAAATVWA MTALIKNPKA MEKVQLEIRK SVGKKGIVNE EDVQNIPYFK AVIKEIFRLY
361 PPAPLLVPRE SMEKTILEGY EIRPRTIVHV NAWAIARDPE IWENPDEFIP ERFLNSSIDY
421 KGQDFELLPF GAGRRGCPGI ALGVASMELA LSNLLYAFDW ELPYGVKKED IDTNVRPGIA
481 MHKKNELCLV PKNYL
NAME D235-AB1
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 245

1 AAAATT CAT A ATGGTTTTTC CCATAGAAGC CTTT GTAGGA CTAGTAACCT TCACATTTCT 
61 CTTATACTTC CTATGGACAA AAAAATCTCA AAAACTTCCA AAACCCTTAC TACCGAAAAT 
121 CCCCGGAGGA TGGCCGGTAA TCGGCCATCT TTTTCACTTC AATAACGACG GCGACGACCG 
181 TCCATTAGCT CGAAAACTCG GAGACTTAGC TGATAAATAC GGCCCCGTTT TCACTTTTCG 
241 GCTAGGTCTT CCCCTTGTGC TAGTT GTAAG CAGTTACGAA GCTATAAAAG ATTGCTTCTC 
301 T ACAAAT GAC GCCATTTTCT CCAATCGTCC AGCTTTTCTT TACGGCGAAT ACCTTGGCTA 
361 CAATAATACA ATGCTTTTTC TAGCAAATTA CGGACCTTAC TGGCGAAAAA AT CGTAAATT 
421 AGT CAT T CAG GAAGTTCTCT CTGCTAGTCG TCTCGAAAAA TTCAAACAAG TGAGATTCAC 
481 CAGAATT CAA AC GAGCAT TA AGAATTTATA CACTCGAATT AAT GGAAATT CGAGTACGAT 
541 AAATCTAACT GATT GGT TAG AAGAATT GGA TTTTGGTCTG AT CGT GAAAA TGATCGCTGG 
601 G AAAAAT TAT GAATCCGGTA AAGGAGATGA ACAAGT GGAA AGATTTAAGA ATGCGTTTAA 
661 GGATTTTAT G GTTTTATCAA TGGAATTTGT ATTAT GGGAT GCATTTCCAA TTCCATTATT 
721 TAAATGGGTG GATTTTCAAG GTCATATTAA G G C AAT G AAA AGGACATTTA AGGATATAGA 
781 TTCTGTTTTT CAGAACT GGT TAGAGGAACA TATTAATAAA AGAGAAAAAA TGGAGGTTGG 
841 TGCAGAAGGG AAT G AACAAG ATTTCATTGA TGTGGTGCTT TCAAAATTGA GTAAAGAATA 
901 TCTTGATGAA GGTTACTCTC GTGATACTGT CATTAAAGCA AC AGTTT TTA GTTTGGTCTT 
961 GGAT GCAGCA GACACAGTTG CTCTTCACAT AAATTGGGGA AT G ACAT TAT TGATAAACAA 
1021 T CAAAAT GC C TT GAT GAAAG CACAAGAAGA GATAGACACA AAAGTT GGTA AGT AT AG AT G 
1081 GGTAGAAGAG AGT GATAT TA AGGATTTAGT AT AC CT C CAA GCTATTGTTA AAAAGGT GTT 
1141 AC GATTATAT CCACCAGGAC CTTTGTTAGT ACCACAT GAA TAT GTAAAGG ATTGTGTTGT 
1201 T AGT GGAT AT CACATTCCTA AAGGGACTAG ATTATT CGCA AACGTCATGA AACTGCAGCG 
1261 CGATCCTAAA CTCTTGTCAA ATCCTGATAA GTTCGATCCA GAGAGATTCA TCGCTGGTGA
1321 TAT CGACTT C CGTGGTCACC ACTAT GAGTT TATCCCATTT GGTTCTGGAA GACGATCTTG 
1381 TCCGGGGATG ACTT AT GCAT T GCAAGT GGA ACACCTAACA AT GGCACATT TAAT CCAGGG 
1441 TTTCAATTAC AAAACT CCAA AT GAC GAGGC CTT GGAT AT G AAGGAAGGT G CAGGCATAAC 
1501 AATACGTAAG GTAAATCCGG T GGAATT GAT AATAACGCCT CGCTT GGCAC CTGAGCTTTA 
1561 CTAAAACCTA AGAT CTTT CA TCTTGGTTGA TCATTGTTTA ATACTCCTAG ATAGAT GGGT 
1621 ATT CATC 



SEQ. ID. NO. 246

1 MVFPIEAFVG LVTFTFLLYF LWTKKSQKLP KPLLPKIPGG WPVIGHLFHF NNDGDDRPLA
61 RKLGDLADKY GPVFTFRLGL PLVLVVSSYE AIKDCFSTND AIFSNRPAFL YGEYLGYNNT
121 MLFLANYGPY WRKNRKLVIQ EVLSASRLEK FKQVRFTRIQ TSIKNLYTRI NGNSSTINLT
181 DWLEELDFGL IVKMIAGKNY ESGKGDEQVE RFKNAFKDFM VLSMEFVLWD AFPIPLFKWV
241 DFQGHIKAMK RTFKDIDSVF QNWLEEHINK REKMEVGAEG NEQDFIDVVL SKLSKEYLDE
301 GYSRDTVIKA TVFSLVLDAA DTVALHINWG MTLLINNQNA LMKAQEEIDT KVGKYRWVEE
361 SDIKDLVYLQ AIVKKVLRLY PPGPLLVPHE YVKDCVVSGY HIPKGTRLFA NVMKLQRDPK
421 LLSNPDKFDP ERFIAGDIDF RGHHYEFIPF GSGRRSCPGM TYALQVEHLT MAHLIQGFNY 
481 KTPNDEALDM KEGAGITIRK VNPVELIITP RLAPELY 
NAME D243-AA2
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 247

1 CAAAAAATCA TTTCTCTCGT CTAAAAT GGA TCTTCTCTTA CTAGAGAAGA CCTTAATTGG 
61 TCTTTTCTTT GCCATTTTAA TCGCTTTAAT T GT CT CT AAA CTTCGTTCAA AGCGTTTTAA 
121 GCTTCCTCCA GGACCAATTC CAGTACCAGT TTTTGGTAAT TGGCTTCAAG TTGGTGATGA 
181 TTTAAACCAC AGAAATCTTA CTGATTATGC CAAAAAATTT GGCGATCTTT TCTTGTTAAG 
241 AATGGGTCAA CGTAACTTAG TTGTTGTGTC ATCTCCTGAA TTAGCTAAAG AAGTTTTACA 
301 CACACAAGGT GTTGAATTTG GTTCAAGAAC AAGAAATGTT GTGTTTGATA TTTTTACTGG 
361 AAAAGGTCAA GATATGGTTT TTACT GTAT A TGGTGAACAT TGGAGAAAAA TGAGGAGAAT 
421 TATGACTGTA CCATTTTTTA CTAATAAAGT T GT GCAACAG TATAGAGGGG GGTGGGAGTT 
481 TGAGGTGGCA AGTGTAATTG AGGATGTGAA AAAAAATCCT GAATCTGCTA CTAAT GGGAT 
541 CGTATTAAGG AGGAGATTAC AATTAAT GAT GTATAATAAT AT GTTTAGGA TTATGTTTGA 
601 TAGGAGATTT GAGAGT GAAG AT GAT CCTTT GTTTGTTAAG CTTAAGGCTT TGAATGGTGA 
661 AAGGAGTAGA TTGGCTCAAA GTTTTGAGTA TAATTAT GGT GATTTTATTC CAATTTTGAG 
721 GCCTCTTTTG AGAGGTTATT TGAAGATCTG TAAAGAAGTT AAGGAGAAGA GGCT GCAGCT 
781 TTTCAAAGAT TACTTTGTTG ATGAAAGAAA GAAGCTTTCA AAT AC C AAG A GCT CGGACAG 
841 CAATGCCCTA AAATGTGCGA TT GAT CACAT TCTTGAGGCT CAACAGAAGG G AGAGAT CAA 
901 TGAGGACAAC GTTCTTTACA TTGTTGAAAA CAT CAAT GTT GCTGCAATTG AAACAACATT 
961 AT GGT CAATT GAGTGGGGTA TCGCCGAGCT AGTCAACCAC CCTCACATCC AAAAGAAACT 
1021 GCGCGACGAG ATTGACACAG TTCTTGGACC AGGAGT GCAA GTGACTGAAC CAGACACCCA 
1081 CAAGCTTCCA TACCTTCAGG CT GT GAT CAA GGAGGCACTT CGTCTCCGTA TGGCAATTCC 
1141 TCTATTAGTC CCACACATGA ACCTTCACGA CGCAAAGCTT GGCGGGCTTG ATATTCCAGC 
1201 AGAGAGCAAA ATCTTGGTTA ACGCTTGGTG GTTAGCTAAC AACCCGGCTC AT T GGAAGAA 
1261 ACCCGAAGAG TTCAGACCCG AGAGGTTCTT TGAAGAGGAG AAGCATGTTG AGGCCAATGG 
1321 CAATGACTTC AGATATCTTC CGTTTGGCGT TGGTAGGAGG AGCTGCCCTG GAAT TAT ACT 
1381 TGCATTGCCA ATTCTTGGCA TCACTTTGGG ACGTTTGGTT CAGAACT TT G AGCTGTTGCC 
1441 TCCTCCAGGC CAGTCGAAGC TCGACACCAC AGAGAAAGGT GGACAGTTCA GT CT C CACAT 
1501 TTT GAAGCAT TCCACCATTG TGTTGAAACC AAGGTCTTTC TGAACTTTGT GATCTTATTA 
1561 ATTAAGGGGT T CT GAAGAAA TTTGATAGTG TTGG 



SEQ. ID. NO. 248

1 MDLLLLEKTL IGLFFAILIA LIVSKLRSKR FKLPPGPIPV PVFGNWLQVG DDLNHRNLTD
61 YAKKFGDLFL LRMGQRNLVV VSSPELAKEV LHTQGVEFGS RTRNVVFDIF TGKGQDMVFT
121 VYGEHWRKMR RIMTVPFFTN KVVQQYRGGW EFEVASVIED VKKNPESATN GIVLRRRLQL
181 MMYNNMFRIM FDRRFESEDD PLFVKLKALN GERSRLAQSF EYNYGDFIPI LRPLLRGYLK
241 ICKEVKEKRL QLFKDYFVDE RKKLSNTKSS DSNALKCAID HILEAQQKGE INEDNVLYIV
301 ENINVAAIET TLWSIEWGIA ELVNHPHIQK KLRDEIDTVL GPGVQVTEPD THKLPYLQAV
361 IKEALRLRMA IPLLVPHMNL HDAKLGGLDI PAESKILVNA WWLANNPAHW KKPEEFRPER
421 FFEEEKHVEA NGNDFRYLPF GVGRRSCPGI ILALPILGIT LGRLVQNFEL LPPPGQSKLD
481 TTEKGGQFSL HILKHSTIVL KPRSF
NAME D244-AD4
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 249

1 AACATTTTGC AATATAGTTT TCCTAGTCAG TTCTAGCCTC CTTTTCCTTA GAAATAATGG 
61 ATTATCATAT TTCTTTCCAT TTTCAAGCTC TTTTAGGGCT TTTAGCCTTT GTGTTCTTGT 
121 CTATTATCTT AT GGAGAAGA ACACTCACTT CAAGAAAATT AGCCCCTGAA ATCCCAGGGG 
181 CAT GGCCTAT TATAGGCCAT CTTCGTCAGC TGAGTGGTAC T G AT AAGAAT ATCCCATTTC 
241 CCCGAATATT GGGCGCTTTG GCAGATAAAT ATGGACCTGT CTTCACACTG AGAATAGGGA 
301 TGTACCCCTA TTTGATTGTC AACAATTGGG AAGCAGCTAA GGATTGTCTC ACAACGCATG 
361 ATAAGGACTT CGCTGCCCGA CCAACTTCTA TGGCTGGTGA AAGCATCGGG TACAAGTATG 
421 CGAGGTTTAC TTAT GCTAAT TTTGGTCCTT ATTATAACCA AGTGCGCAAA CTAGCCCTAC 
481 AACATGTACC CTCGAGTACT AAACTCGAGA AAAT GAAAC A CAT AC GT GT T TCTGAATTGG 
541 AAACTAGCAT CAAAGAATTA TATTCTTTGA CGCTGGGCAA AAACAACATG CAAAAAGTGA 
601 ATATAAGTAA AT GGTTT GAA CAATTGACTT TAAACATAAT CGT GAAGACA ATTTGTGGCA 
661 AGAGATATAG CAACATAGAG GAGGATGAAG AGGCACAACG TTT CAGAAAG GCATTTAAGG 
721 GCATCATGTT TGTTGTAGGG CAAATTGTTT TAT AT GACGC AATTCCATTC CCATTGTTCA 
781 AATACTTTGA TTTCCAAGGT CATATACAAT TGATGAACAA AATTTATAAA GACTTAGATT
841 CTATTCTTCA AGGATGGTTG GAT GAT CAT A TGATGAACAA GGATGTAAAC AATAAGGATC 
901 AAGAT GCCAT AGAT GC CAT G CTTAAGGTAA CACAACTTAA TGAATTCAAA GCCTATGGTT 
961 TTTCTCAGGC CACT GT GAT C AAGTC GACAG TCTTGAGTTT GATCTTAGAT GGAAAT GACA 
1021 CAACCGCTGT T CATTT GATA T GGGTAAT GT CCT TATTACT GAACAAT CCA CAT GTTAT GA 
1081 AACAAGGCCA AGAAGAGATA GACAT GAAAG TGGGTAAAGA GAGGT GGATT GAAGATACTG 
1141 ACATAAAAAA TTTAGT GTAC CTTCAGGCTA TCGTTAAAGA GACATTGCGC TTGTAT CCAC 
1201 CTGTTCCTTT TCTTTTACCA CACGAAGCAG TGCAAGATTG TAAAGTGACT GGTTACCACA
1261 TTCCTAAAGG TACT CGT CTA TAT AT CAAT G CGTGGAAAGT ACATCGCGAT CCTGAAATTT 
1321 GGTCAGAGCC CGAAAAGTTT AT GCCCAATA GATTCTTGAC T AGCAAAG CA AATATAGATG 
1381 CTCGCGGTCA AAATTTTGAA TTTATACCGT TTGGTTCTGG GAGACGGTCA TGTCCAGGGA 
1441 TAGGTTTTGC GACT TTAGT G ACACATCTGA CTTTTGGTCG CTTGCTTCAA GGTTTTGATT 
1501 TTAGT AAGC C AT CAAAC AC G CCAATT GACA TGACAGAAGG CGTAGGCGTT ACTTTGCCTA 
1561 AGGTTAATCA AGTTGAAGTT CTAATTACGC CTCGTTTACC TTCTAAGCTT TATTTATTTT 
1621 GAAAGT GCAA AT CAT CAAT C ATGGCTTGAG TAATTAGTTA TACTTTAATA TGTTTCTC 

SEQ. ID. NO. 250

1 MDYHISFHFQ ALLGLLAFVF LSIILWRRTL TSRKLAPEIP GAWPIIGHLR QLSGTDKNIP
61 FPRILGALAD KYGPVFTLRI GMYPYLIVNN WEAAKDCLTT HDKDFAARPT SMAGESIGYK
121 YARFTYANFG PYYNQVRKLA LQHVPSSTKL EKMKHIRVSE LETSIKELYS LTLGKNNMQK
181 VNISKWFEQL TLNIIVKTIC GKRYSNIEED EEAQRFRKAF KGIMFVVGQI VLYDAIPFPL
241 FKYFDFQGHI QLMNKIYKDL DSILQGWLDD HMMNKDVNNK DQDAIDAMLK VTQLNEFKAY
301 GFSQATVIKS TVLSLILDGN DTTAVHLIWV MSLLLNNPHV MKQGQEEIDM KVGKERWIED
361 TDIKNLVYLQ AIVKETLRLY PPVPFLLPHE AVQDCKVTGY HIPKGTRLYI NAWKVHRDPE
421 IWSEPEKFMP NRFLTSKANI DARGQNFEFI PFGSGRRSCP GIGFATLVTH LTFGRLLQGF
481 DFSKPSNTPI DMTEGVGVTL PKVNQVEVLI TPRLPSKLYL F 
NAME D247-AH1
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 251

1 TGATAATGCT CTTTCTACTC TTTGTAGCCC TTCCTTTCAT TCTTATTTTT CTTCTTCCTA
61 AATTCAAAAA TGGTGGAAAT AACAGATTGC CACCAGGTCC TATAGGTTTA CCATTCATTG
121 GAAATTTGCA TCAATATGAT AGTATAACTC CTCATATCTA TTTTTGGAAA CTTTCCAAAA
181 AATATGGCAA AATCTTCTCA TTAAAACTTG CTTCTACTAA TGTGGTAGTA GTTTCTTCAG
241 CAAAATTAGC AAAAGAAGTA TTGAAAAAAC AAGATTTAAT ATTTTGTAGT AGACCATCTA
301 TTCTTGGCCA ACAAAAACTG TCTTATTATG GTCGTGATAT TGCTTTTGCA CCTTATAATG
361 ATTATTGGAG AGAAATGAGA AAAATTTGTG TTCTTCATCT TTTTAGTTTA AAAAAAGTTC
421 AATTATTTAG TCCAATTCGT GAAGATGAAG TTTTTAGAAT GATTAAGAAA ATATCAAAAC
481 AAGCTTCTAC TTCACAAATT ATTAATTTGA GTAATTTAAT GATTTCATTA ACAAGTACAA
541 TTATTTGTAG AGTTGCTTTT GGTGTTAGGT TTGAAGAAGA AGCACATGCA AGGAAGAGAT
601 TTGATTTTCT TTTGGCCGAG GCACAAGAAA TGATGGCTAG TTTCTTTGTA TCTGATTTTT
661 TTCCCTTTTT AAGTTGGATT GATAAATTAA GTGGATTGAC ATATAGACTT GAGAGGAATT
721 TCAAGGATTT GGATAATTTT TATGAAGAAC TCATTGAGCA ACATCAAAAT CCTAATAAGC
781 CAAAATATAT GGAAGGAGAT ATTGTTGATC TTTTGCTACA ATTGAAGAAA GAGAAATTAA
841 CACCACTTGA TCTCACTATG GAAGATATAA AAGGAATTCT CATGAATGTG TTAGTTGCAG
901 GATCAGACAC TAGTGCAGCT GCTACTGTTT GGGCAATGAC AGCCTTGATA AAGAATCCTA
961 AAGCCATGGA AAAAGTTCAA TTAGAAATCA GAAAATCAGT TGGGAAGAAA GGCATTGTAA
1021 ATGAAGAAGA TGTCCAAAAC ATCCCTTATT TTAAAGCAGT GATAAAGGAA ATATTTAGAT
1081 TGTATCCACC AGCTCCACTT TTAGTTCCAA GAGAATCAAT GGAAAAAACC ATATTAGAAG
1141 GTTATGAAAT TCGGCCAAGA ACCATAGTTC ATGTTAACGC TTGGGCTATA GCAAGGGATC
1201 CTGAAATATG GGAAAATCCA GATGAATTTA TACCTGAGAG ATTTTTGAAT AGCAGTACCG
1261 ATTACAAGGG TCAAGATTTT GAGTTACTTC CATTTGGTGC AGGCAGAAGA GGTTGCCCAG
1321 GTATTGCACT TGGGGTTGCA TCCATGGAAC TTGCTTTGTC AAATCTTCTT TATGCATTTG
1381 ATTGGGAGTT GCCTTATGGA GTGAAAAAAG AAGACATCGA CACAAACGTT AGGCCTGGAA
1441 TTGCCATGCA CAAGAAAAAC GAACTTTGCC TTGTCCCAAA AAATTATTTA TAAATTATAT
1501 TGGGACGTGG ATCTCAATTT AGTTCTGTGA GGTCAGC



SEQ. ID. NO. 252 

1 MLFLLFVALP FILIFLLPKF KNGGNNRLPP GPIGLPFIGN LHQYDSITPH IYFWKLSKKY
61 GKIFSLKLAS TNVVVVSSAK LAKEVLKKQD LIFCSRPSIL GQQKLSYYGR DIAFAPYNDY
121 WREMRKICVL HLFSLKKVQL FSPIREDEVF RMIKKISKQA STSQIINLSN LMISLTSTII
181 CRVAFGVRFE EEAHARKRFD FLLAEAQEMM ASFFVSDFFP FLSWIDKLSG LTYRLERNFK
241 DLDNFYEELI EQHQNPNKPK YMEGDIVDLL LQLKKEKLTP LDLTMEDIKG ILMNVLVAGS
301 DTSAAATVWA MTALIKNPKA MEKVQLEIRK SVGKKGIVNE EDVQNIPYFK AVIKEIFRLY
361 PPAPLLVPRE SMEKTILEGY EIRPRTIVHV NAWAIARDPE IWENPDEFIP ERFLNSSTDY
421 KGQDFELLPF GAGRRGCPGI ALGVASMELA LSNLLYAFDW ELPYGVKKED IDTNVRPGIA
481 MHKKNELCLV PKNYL
NAME D248-AA6
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 253

1 CCAAAAT CAT GGCTCTATCT TT CAT ATT CA TATCCATAAC CCTAATTTTT CTAGTT CATA 
61 AACTCTACCA CCGTCTTAGA TTCAAACTAC CACCAGGTCC GCGGCCGTTA CCGGTGGTCG 
121 GAAACCT CTA CGACATAAAA CCGGT GAGAT TCCGGTGCTT TGCCGATTGG GCCAAAACTT 
181 ACGGTCCGAT TTTCTCAGTA TACTTTGGGT CACAGTTAAA TGTTGTGGTA ACAACAGCTG 
241 AAT TAGCTAA AGAAGTATTG AAAGAAAATG AC CAGAATTT AGCAGATAGA TTTAGGACTA 
301 GACCTGCAAA TAATTT GAGC AGAAATGGGA TGGATTTGAT TTGGGCTGAT TATGGGCCTC 
361 ATTATGTGAA AGTAAG GAAG CTCTGTAATC TTGAGCTTTT TACT C CT AAA AGACTTGAAG 
421 CTCTTAGACC TATTAGAGAA GAT GAAGTTA CTGCTATGGT TGAAAACATT TT CAAGGATT 
481 GTACTAAGCC T GAT AACAC A GGTAAAAGCT TGTTGATAAG AGAGTACTTA GGAT CAGT AG 
541 CAT T CAAC AA CATTACAAGG TTAACATTTG GGAAAAGGTT CAT GAACT CA AAAGGT GAGA 
601 T T GAT GAG C A AGGT CAAGAA TTCAAGGGTA TTGTCTCTAA TGGCAT CAAA ATTGGCGGAA 
661 AACTTCCCTT GGCAGAGTAT GTTCCATGGC TCCGTTGGTT TTTCACAATG GAAAACGAGG
721 CACTCGTGAA GCACTCTGCA CGTAGAGACC GGTTAACAAG AAT GAT CAT G GATGAACACA 
781 CACTGGCTCG CAAGAAAACT GGT GATACTA AGCAGCATTT TGTCGATGCA TTGCTTACTC 
841 TT CAGAAGCA GTATGATCTT AGTGATGACA CTGTTATTGG CCTCCTCTGG GAT AT GAT TA 
901 CAGCAGGAAT GGACACAACA AC CAT AACAG TGGAATGGGC AAT GGCAGAA CTAGTTAAGA 
961 ACCCAAGAGT GCAACTAAAA GCT CAAGAGG AGCTT GACAG GGTAATCGGA ACGGATCGAA 
1021 T CAT GT CAGA AACCGATTTC TCTAAACTTC CTTACCTACA ATGTGTAGCC AAAGAGGCTC 
1081 TAAGGTT GCA CCCTCCAACT CCTCTAATGC TTCCTCATAA GGCCAGTGCC AGTGTCAAAA 
1141 TTGGTGGTTA TGACATTCCT AAGGGGTCCA TCGTGCACGT GAACGTTTGG GCTGTCGCTC 
1201 GTGACCCAGC C GT GT G GAAG AACCCGTTGG AGTT CAGACC AGAGCGCTTC CTT GAGGAAG 
1261 AC GT T GACAT GAAGGGT CAC GACTAT CGGT TATTGCCCTT TGGTGCAGGA AGGCGTGTTT 
1321 GCCCCGGTGC ACAACTT GCT AT CAACTT GG TCACATCTAT GTTGGGTCAT TTGTTGCATC 
1381 ATTTTACATG GGCTCCGGCC CCGGGGGTTA ACCCGGAGGA TATTGACTTG GAGGAGAGCC 
1441 CT GGAACAGT AACTTACATG AAAAATCCAA TACAAGCTAT TCCAACTCCA AGATTGCCTG 
1501 CACACTT GTA TGGACGTGTG C CAGT GGAT A T GT AAAAC AT TTTGTTCTTT CCCTTTTTGG 
1561 TT AT AT GAT G AG 

SEQ. ID. NO. 254 

1 MALSFIFISI TLIFLVHKLY HRLRFKLPPG PRPLPVVGNL YDIKPVRFRC FADWAKTYGP
61 IFSVYFGSQL NVVVTTAELA KEVLKENDQN LADRFRTRPA NNLSRNGMDL IWADYGPHYV
121 KVRKLCNLEL FTPKRLEALR PIREDEVTAM VENIFKDCTK PDNTGKSLLI REYLGSVAFN
181 NITRLTFGKR FMNSKGEIDE QGQEFKGIVS NGIKIGGKLP LAEYVPWLRW FFTMENEALV
241 KHSARRDRLT RMIMDEHTLA RKKTGDTKQH FVDALLTLQK QYDLSDDTVI GLLWDMITAG
301 MDTTTITVEW AMAELVKNPR VQLKAQEELD RVIGTDRIMS ETDFSKLPYL QCVAKEALRL
361 HPPTPLMLPH KASASVKIGG YDIPKGSIVH VNVWAVARDP AVWKNPLEFR PERFLEEDVD
421 MKGHDYRLLP FGAGRRVCPG AQLAINLVTS MLGHLLHHFT WAPAPGVNPE DIDLEESPGT 
481 VTYMKNPIQA IPTPRLPAHL YGRVPVDM 
NAME D249-AE8
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 255

1 AATCACTAAT TTTCATGTAC TCTCATAGGT CAAAAGTTTC AACCAAAATC ATGGCTCTAT 
61 CCTTCATATT CATATCCATA ACCCTAATTT TTCTAGTTCA TAAACTCTAC CACCGTCTTA
121 GATTCAAACT AC CACCAGGT CCGCGGCCGT TACCGGTGGT CGGAAACCTC TACGACATAG 
181 AACCGGTGAG ATTCCGGTGC TTTGCCGATT GGGCCAAAAC TTACGGTCCG ATTTTCTCAG 
241 TATACTTTGG GTCACAGTTA AATGTTGTGG TAACAACAGC TGAATTAGCT AAAGAAGTAT 
301 TGAAAGAAAA TGACCAGAAT TTAGCAGATA GAT T TAGGAC TAGACCTGCA AATAATTTGA 
361 GCAGAAATGG GATGGATTTG ATTTGGGCTG ATTATGGGCC TCATTATGTG AAAGTAAGGA 
421 AGCTCTGTAA TCTTGAGCTT TTTACTCCTA AAAGACTTGA AGCT CTTAGA CCTATTAGAG 
481 AAGAT GAAGT TACTGCTATG GTT GAAAACA TTTTCAAGGA TTGTACTAAG CCT GATAACA 
541 CAGGTAAAAG CTTGTTGATA AGAGAGTACT TAGGATCAGT AGCATT CAAC AACATTACAA 
601 GGTTAACATT TGGGAAAAGG TT CAT GAACT CAAAAGGTGA GATTGATGAG CAAGGT CAAG 
661 AATT CAAGGG TATTGTCTCT AATGGCATCA AAATTGGCGG AAAACTTCCC TT GGCAGAGT 
721 ATGTTCCATG GCTCCGTTGG TTTTTCACAA TGGAAAACGA GGCACTCGTG AAGCACTCTG 
781 CAC GTAGAGA CCGGTTAACA AGAAT GAT C A TGGATGAACA CACACT GGCT CGCAAGAAAA 
841 CT GGT GAT AC TAAGCAGCAT TTTGTCGATG CATTGCTTAC T CT T CAGAAG CAGTATGATC 
901 TTAGTGATGA CACT GTTATT GGCCTCCTCT GGGATAT GAT TACAGCAGGA AT GGACACAA 
961 CAACCATAAC AGTGGAATGG GCAATGGCAG AACTAGT TAA GAACC CAAGA GTGCAACTAA 
1021 AAGCT CAAGA GGAGCTT GAC AGGGTAATCG GAACGGATCG AATCATGTCA GAAACCGATT 
1081 TCTCTAAACT TCCTTACCTA CAATGTGTAG CCAAAGAGGC TCTAAGGTTG CACCCTCCAA 
1141 CTCCTCTAAT GCTTCCTCAT AGGGCCAGTG CCAGTGTCAA AATTGGTGGT TATGACATTC 
1201 CTAAGGGGTC CAT CGT GCAC GTGAACGTTT GGGCTGTCGC TCGTGACCCA GCCGTGTGGA 
1261 AGAACCCGTT GGAGTT CAGA CCAGAGCGCT TCCTTGAGGA AGACGTT GAC AT GAAGGGT C 
1321 AC GACT AT C G GTTATT GCCC TTTGGTGCAG GAAGGCGTGT TTGCCCCGGT GC AC AACT T G 
1381 CT AT C AACT T GGTCACATCT ATGTTGGGTC ATTTGTTGCA TCATTTTACA TGGGCTCCGG 
1441 CCCCGGGGGT TAACCCGGAG GATATTGACT TGGAGGAGAG CCCTGGAACA GTAACT TACA 
1501 TGAAAAATCC AATACAAGCT ATTCCAACTC CAAGATT GC C TG CACACT TG TAT GGACGT G 
1561 TGCCAGTGGA TAT GTAAAAC 

SEQ. ID. NO. 256 

1 MYSHRSKVST KIMALSFIFI SITLIFLVHK LYHRLRFKLP PGPRPLPVVG NLYDIEPVRF
61 RCFADWAKTY GPIFSVYFGS QLNVVVTTAE LAKEVLKEND QNLADRFRTR PANNLSRNGM
121 DLIWADYGPH YVKVRKLCNL ELFTPKRLEA LRPIREDEVT AMVENIFKDC TKPDNTGKSL
181 LIREYLGSVA FNNITRLTFG KRFMNSKGEI DEQGQEFKGI VSNGIKIGGK LPLAEYVPWL
241 RWFFTMENEA LVKHSARRDR LTRMIMDEHT LARKKTGDTK QHFVDALLTL QKQYDLSDDT
301 VIGLLWDMIT AGMDTTTITV EWAMAELVKN PRVQLKAQEE LDRVIGTDRI MSETDFSKLP
361 YLQCVAKEAL RLHPPTPLML PHRASASVKI GGYDIPKGSI VHVNVWAVAR DPAVWKNPLE
421 FRPERFLEED VDMKGHDYRL LPFGAGRRVC PGAQLAINLV TSMLGHLLHH FTWAPAPGVN
481 PEDIDLEESP GTVTYMKNPI QAIPTPRLPA HLYGRVPVDM
NAME D250-AC11
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 257

1 ATAATGCTCT TTCTACTCTT TGTAGCCCTT CCTTTCATTC TTATTTTTCT TCTTCCTAAA
61 TTCAAAAATG GTGGAAATAA CAGATTGCCA CCAGGTCCTA TAGGTTTACC ATTCATTGGA
121 AATTTGCATC AATATGATAG TATAACTCCT CATATCTATT TTTGGAAACT TTCCAAAAAA
181 TATGGCAAAA TCTTCTCATT AAAACTTGCT TCTACTAATG TGGTAGTAGT TTCTTCAGCA
241 AAATTAGCAA AAGAAGTATT GAAAAAACAA GATTTAATAT TTTGTAGTAG ACCATCTATT
301 CTTGGCCAAC AAAAACTGTC TTATTATGGT CGTGATATTG CTTTTGCACC TTATAATGAT
361 TATTGGAGAG AAATGAGAAA AATTTGTGTT CTTCATCTTT TTAGTTTAAA AAAAGTTCAA
421 TTATTTAGTC CAATTCGTGA AGATGAAGTT TTTAGAATGA TTAAGAAAAT ATCAAAACAA
481 GCTTCTACTT CACAAATTAT TAATTTGAGT AATTTAATGA TTTCATTAAC AAGTACAATT
541 ATTTGTAGAG TTGCTTTTGG TGTTAGGTTT GAAGAAGAAG CACATGCAAG GAAGAGATTT
601 GATTTTCTTT TGGCCGAGGC ACAAGAAATG ATGGCTAGTT TCTTTGTATC TGATTTTTTT
661 CCCTTTTTAA GTTAGATTGA CAAATTAAGT GGATTGACAT ATAGACTTGA GAGGAATTTC
721 AAGGATTTGG ATAATTTTTA TGAAGAACTC ATTGAGCAAC ATCAAAATCC TAATAAGCCA
781 AAATATATGG AAGGAGATAT TGTTGATCTT TTGCTACAAT TGAAGAAAGA GAAATTAACA
841 CCACTTGATC TCACTATGGA AGATATAAAA GGAATTCTCA TGAATGTGTT AGTTGCAGGA
901 TCAGACACTA GTGCAGCTGC TACTGTTTGG GCAATGACAG CCTTGATAAA GAATCCTAAA
961 GCCATGGAAA AAGTTCAATT AGAAATCAGA AAATCAGTTG GGAAGAAAGG CATTGTAAAT
1021 GAAGAAGATG TCCAAAACAT CCCTTATTTT AAAGCAGTGA TAAAGGAAAT ATTTAGATTG
1081 TATCCACCAG CTCCACTTTT AGTTCCAAGA GAATCAATGG AAAAAACCAT ATTAGAAGGT
1141 TATGAAATTC GGCCAAGAAC CATAGTTCAT GTTAACGCTT GGGCTATAGC AAGGGATCCT
1201 GAAATATGGG AAAATCCAGA TGAATTTATA CCTGAGAGAT TTTTGAATAG CAGTATCGAT
1261 TACAAGGGTC AAGATTTTGA GTTACTTCCA TTTGGTGCAG GCAGAAGAGG TTGCCCAGGT
1321 ATTGCACTTG GGGTTGCATC CATGGAACTT GCTTTGTCAA ATCTTCTTTA TGCATTTGAT
1381 TGGGAGTTGC CTTATGGAGT GAAAAAAGAA GACATCGACA CAAACGTTAG GCCTGGAATT
1441 GCCATGCACA AGAAAAACGA ACTTTGCCTT GTCCCAAAAA AATTATTTAT AAATTATATT
1501 GGGACGTGGA TCTCATGCTA GTTCTGTGCG GTCAGCTAAG CTTA

SEQ. ID. NO. 258 

1 MLFLLFVALP FILIFLLPKF KNGGNNRLPP GPIGLPFIGN LHQYDSITPH IYFWKLSKKY 
61 GKIFSLKLAS TNVVVVSSAK LAKEVLKKQD LIFCSRPSIL GQQKLSYYGR DIAFAPYNDY 
121 WREMRKICVL HLFSLKKVQL FSPIREDEVF RMIKKISKQA STSQIINLSN LMISLTSTII 
181 CRVAFGVRFE EEAHARKRFD FLLAEAQEMM ASFFVSDFFP FLS.IDKLSG LTYRLERNFK 
241 DLDNFYEELI EQHQNPNKPK YMEGDIVDLL LQLKKEKLTP LDLTMEDIKG ILMNVLVAGS 
301 DTSAAATVWA MTALIKNPKA MEKVQLEIRK SVGKKGIVNE EDVQNIPYFK AVIKEIFRLY 
361 PPAPLLVPRE SMEKTILEGY EIRPRTIVHV NAWAIARDPE IWENPDEFIP ERFLNSSIDY 
421 KGQDFELLPF GAGRRGCPGI ALGVASMELA LSNLLYAFDW ELPYGVKKED IDTNVRPGIA 
481 MHKKNELCLV PKKLFINYIG TWISC 
NAME D259-AB9 
ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 259 

1 CACATTGAGT CCTCTCCCAA ATCACTGATT CACCACCAAA AGTACCAACA ATTCAATGGA 
61 Sgtacaaac ttgactacat ATGCAGCAGT ATTTCTTGAT actctgtttc TTTTGTTCCT 
121 TTCCAAACTT CTTCGCCAGA GGAAACTCAA TTTACCTCCA GGCCCAAAAC CATGGCCGAT 

181 S^SgaS? ttaaacctta ttggcaatct tcctcatcgc tcaatccacg aactctccct 
241 caag?a5gga cccgttatgc aactccaatt cgggtctttc cccgttgtag ttggatcctc 
301 cotcSaaS gctaagattt tcctcaaatc catggatatt aactttgtag gcaggcctaa 

361 AACGGCTGCC GGAAAATACA CAACGTACAA TTATTCCGAT ATTACATGGT CTCCTTACGG 
421 ACCATATTGG CGCCAGGCAC GTAGGATGTG CCTAACGGAA TTATTCAGCA CGAAACGTCT 

481 SgattSta? gagtatattc gggctgagga gttgcattct cttctccata atttgaacaa 
541 aSSSggS aaa™ttg tgttgaaaga ttatttgacg acgttgagtt taaatgttat 
601 TAGCAGGATG gtactgggga aaaggtattt ggacgaatcc gagaactcgt tcgtgaatcc 
661 ^?gSatt? aagaagatgt tggacgaatt gtttttgcta aatggtgtac ttaatattgg 

721 ATTTCATGGA TTTGCAAGGT TATGTTAAGA GGATGAAAGT 

781 AGTGAGCAAG AAATTCGACA AGTTTTTAGA GCATGTTATT GATGAGCATA ACATTAGGAG 
841 JSSSSSiS gSSatg TTGCTAAGGA TATGGTGGAT GTTTTGTTGC agcttgctga 
901 TGATCCGAAG TTGGAAGTTA AGCTGGAGAG ACATGGAGTC AAAGCATTCA CTCAGGATAT 
961 GCTGGCTGGT GGAaSgAGA GTTCAGCAGT GACAGTGGAG TGGGCAATTT CAGAGCTGCT 

1021 aaagaagccg gagattttca aaaaggctac agaagaattg gatcgagtaa ttgggcagaa 

1081 TAGATGGGTA CAAGAAAAGG ACATTCCAAA TCTTCCTTAC ATAGAGGCAA TAGTCAAAGA 

1141 GACTA^GCGA CTGCACCCCG tggcaccaat gttggtgcca cgtgagtgtc gagaagatat 

1201 TAAGGTAGCA GGCTACGACG TTCAGAAAGG AACTAGGGTT CTCGTGAGTG TATGGACTAT 

1261 t^aSagac cctacattgt gggacgagcc tgaggtgttc aagccggaga gattccatga 
1321 SaStccata gatgttaaag gacatgatta tgagcttttg ccatttggag cggggagaag 
1381 aa^ctSScg ggttatagct tggggctcaa ggtgattcaa gctagcttag ctaatcttct 
1441 iSSSSSS aactggtcat tgcctgataa tatgactcct gaggacctca acatggatga 
1501 SSctttggg ctctctacac ctaaaaaatt tccacttgct actgtgattg agccaagact 

1561 TTCACCAAAA CTTTACTCTG TTTGATTCAG CAGTTCTATG GTTCCGTCAA GATAG 
SEQ. ID. NO. 260 

AVFLDTLFLL FLSKLLRQRK LNLPPGPKPW PIIGNLNLIG NLPHRSIHEL 

61 slSSSS S?gsf™ ssvemak£fl KSMDINFVGR pktaagkytt ynysditwsp 
121 ygSrqarr mcltelfstk rldsyeyira eelhsllhnl nkisgkpivl kdylttlsln 
181 ^sbmv£gkr yldesensfv npeefkkmld elfllngvln igdsipwidf mdlqgyvkrm 
241 kwskkfdkf lehvidehni rrngvenyva kdmvdvllql addpklevkl erhgvkaftq 

301 DMLAGGTESS AVTVEWAISE LLKKPEIFKK ATEELDRVIG QNRWVQEKDI PNLPYIEAIV 
361 KETMRLHPVA PMLVPRECRE DIKVAGYDVQ KGTRVLVSVW TIGRDPTLWD EPEVFKPERF 
421 DYELLPFGAG RRMCPGYSLG LKVIQASLAN LLHGFNWSLP DNMTPEDLNM 

481 DEIFGLSTPK KFPLATVIEP RLSPKLYSV 



FIG. 131 






NAME D218A-AC2 
ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 261 

1 CTTCTTCCTT CCTAACTAAA AAT GGAGATT CAGTTTTCTA ACTTAGTTGC ATTCTTGCTC 
61 TTTCTCTCCA GCATCTTTCT T GTATT C AAA AAATGGAAAA CCAGAAAACT AAATTTGCCT 
121 CCTGGTCCAT GGAAATTACC TTTTATTGGA AGTTTACACC ATTTGGCTGT GGCAGGTCCA 
181 CTTCCTCACC ATGGCCTAAA AAATTTAGCC AAACGCTATG GTCCTCTTAT GCATTTACAA 

2 41 CTTGGACAAA TTCCTACACT C GT CAT AT CA TCACCTCAAA TGGCAAAAGA AGTACTAAAA 
301 ACTCACGACC TCGCTTTTGC CACTAGACCA AAGCTTGTCG TGGCCGACAT CATTCACTAC 

3 61 GACAGCACGG ACATAGCACT TTCGCCATAC GGT GAATACT GGAGACAAAT TCGTAAAATT 
421 TGCATATTGG AACTCTTGAG TGCCAAGATG GTCAAGTTTT TTAGCT C GAT TCGCCAAGAT 

4 81 GAGCTCTCGA AGATGGTTTC AT CTATACGA ACGACGCCCA ATCTTCCAGT CAAT CTTACC 
541 GACAAGATTT TTTGGTTTAC GAGTTCGGTA ATTTGTAGAT CAGCTTTAGG GAAGAT AT GT 
601 GGTGACCAAG ACAAATTGAT CATTTTTATG AGGGAAATAA TATCATTGGC AGGT GGATTT 
661 AGTATTGCTG ATTTTTTCCC TACAT GGAAA AT GAT T CAT G AT AT T GAT GG TTCAAAATCT 
721 AAACTGGTGA AGGCACATCG TAAGATT GAT GAAATTTTGG AAAAT GT GGT AAATGAGCAC 
781 AAACAGAATC GAG CAGAT GG TAAAAAGGGT AATGGTGAAT T T GGT GGAGA AGATCTGATT 

8 41 GATGTTTTGT T AAGAGT TAG AGAAAGT GGA GAAGTTCAAA TTCCAATCAC AGAT GACAAT 
901 ATCAAATCAA TATTAATCGA CATGTTCTCT GCCGGATCGG AAAC AT CAT C GACAACTATA 

9 61 ATTTGGGCAT TAGCT GAAAT GAT G AAG AAA CCAAGTGTTT TAGCAAAGGC ACAAGCT GAA 
1021 GT GAGCCAAG CTTT GAAGGG GAAG AAAAT T AGTTTTCAAG AGATTGATAT TGATAAGCTA 
1081 AAGTATTTGA AGTTAGTGAT CAAAGAAACT TTAAGAATGC ACCCTCCAAT TCCTCTGTTA 
1141 GTCCCTAGAG AATGTATGGA AGATACAAAG ATTGATGGTT ACAAT AT AC C TTT CAAAACA 
12 01 AGAGTCATTG TTAATGCATG GGCAATTGGA CGAGATCCTC AAAGTTGGGA T GAT C CT GAA 
12 61 AGCTTTACGC CAGAGAGATT T GAGAAT AAT TCTATTGATT TTCTT GGAAA T CAT CAT C AA 
1321 TTTATTCCAT TTGGTGCAGG AAGAAGGATT TGTCCTGGAA TGCTATTTGG TTTAGCTAAT 
1381 GTTGGACAAC CTTTAGCTCA GTTACTTTAT CACTTCGATT GGAAACTCCC TAATGGACAA 

14 41 ACT C AC CAAA ATTTCGACAT GACTGAGTCA CCTGGAATTT CTGCTACAAG AAAGGAT GAT 

15 01 CTTATTTTGA TTGCCACTCC TGCTCATTCT T GATTAAGT A TTGCTGCTTT TCTATTGGAG 
15 61 AATTTTCAAA ATTCATCCAC AATATATAGT GTTTGCTAGA GTTGGTTAGC 

SEQ. ID. NO. 262 

1 MEIQFSNLVA FLLFLSSIFL VFKKWKTRKL NLPPGPWKLP FIGSLHHLAV AGPLPHHGLK 
61 NLAKRYGPLM HLQLGQIPTL VISSPQMAKE VLKTHDLAFA TRPKLVVADI IHYDSTDIAL 

121 SPYGEYWRQI RKICILELLS AKMVKFFSSI RQDELSKMVS SIRTTPNLPV NLTDKIFWFT 

181 SSVICRSALG KICGDQDKLI IFMREIISLA GGFSIADFFP TWKMIHDIDG SKSKLVKAHR 

241 KIDEILENVV NEHKQNRADG KKGNGEFGGE DLIDVLLRVR ESGEVQIPIT DDNIKSILID 

301 MFSAGSETSS TTIIWALAEM MKKPSVLAKA QAEVSQALKG KKISFQEIDI DKLKYLKLVI 

361 KETLRMHPPI PLLVPRECME DTKIDGYNIP FKTRVIVNAW AIGRDPQSWD DPESFTPERF 
421 ENNSIDFLGN HHQFIPFGAG RRICPGMLFG LANVGQPLAQ LLYHFDWKLP NGQTHQNFDM 

481 TESPGISATR KDDLILIATP AHS 
NAME D210-RD4 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 263 

1 CT T T CAT CAT AT GGCAT GAA AT GGGAAAT G CT C ACAACAG CAAAATT GCA GCAATCTGTT 
61 TGATAATTTT CTTGGTATAT AAAGCAT GGG AATTGTTGAA GTGGATATGG ATTAAGCCAA 
121 AGAAACT GGA GAGTTGCCTC AGAAAACAGG GACT CAAAGG AAATtCCTAC GGGCTATTCT 
181 AT GGAGATAT GAAAGAATtG TCCAAAAGTC TCAAGGAAAT CAATT CAAAG CC CAT CATC A 
241 AT CTAT CAAA TGAAGTAGCC CCAAGAATCA TTCCTTATtA TCTTGAAATC AT C C AAAAAT 
3 01 AT GGTAAAAG ATGTTTTGTT TGGCAAGGAC CAACCCCCGC AATATTAATA ACAGAGCCAG 

3 61 AATTAATAAA GGAGATAT T T GGTAAGAACT ATGTTTTTCA GAAGC CTAAT AATCCCAACC 
421 CACTGACCAA GTTATTGGCT CGAGGTGTTG TAAGCTACGA GGAAGAAAAA TGGGCAAAAC 

4 81 ACAGAAAGAT CTTAAATCCT GCCTTTCATA TGGAGAAGTT GAAGCAT AT G CTACCAGCAT 
541 TTTACTTGAG CT GTAGT GAG AT GCT GAACA AAT GGGAGGA GATTATCCCA GTAAAAGAAT 
601 CAAATGAGTT GGACATTTGG CCTCATCTTC AAAGAAT GAC AAGTGATGTG ATTTCTCGTG 
661 CTGCCTTTGG TAGTAGCTAC GAAGAAGGAA GAAGAAT AT T TGAACTTCAA GAAGAACAAG 
721 CTGAGTATCT AACGAAGACA TT CAATT CAG TTTATATCCC AGGTTCCAGA TTTTTTCCCA 

7 81 AT AAAAT GAA CAAAAGAATG AAAGAAT GT G AAAAGGAAGT ACGAGAAACA ATTACGTGTC 

8 41 TAATT GACAA CAGATTAAAG GCAAAAGAAG AAGGCAATGG CAAGGCCCTC AATGATGACC 
901 TATTGGGTAT ATTATTAGAG TCAAATTCTA T AGAAATT GA AGAAC AT G GT AACAAGAAGT 
961 T T GGAAT GAG TATACCT GAA GTAATTGAAG AGT GCAAATT ATTCTATTTT GCTGGCCAAG 

1021 AGACTACATC AGTATT GCTT GT GT GGACAC TGATTTTGTT AGGGAGAAAt cCAGAATGGC 
1081 AGGAACGTGC TAGAGAGGAA GTTTTTCAAG CCTTTGGAAG TGATAAACCA ACTTTTGACG 
1141 AAT TAT AT C G CTTGAAAATT GT GAC GAT GA TTTTGTACGA GT CTTTAAGG TTATATCCAC 
1201 CAATAGCAAC TCGTACTCGA AGGACTAATG AAGAAACAAA ATTAGGGGAA CTAGATTTAC 

12 61 CAAAGGGTGC ACTGCTCTTT ATACCAACAA T CT T AT TACA T CT T GAC AG G GAAATTTGGG 
1321 GTGAAGATGC AGAT GAGTT C AATCCGGAGA GATTTAGCGA AGGGGT GGCA AAGGCAACAA 

13 81 AGGGGAAAAT GACATAT TTT CCATTTGGTG CAGGACCGCG AAAAT GCATT GGGCAAAACT 
1441 TCGCGATTTT GGAAGCAAAA AT GGCTATAG CTATGATTCT ACAACGCTTC TCCTTCGAGC 
1501 TCTCTCCATC TTATACACAC TCTCCATACA CTGTGGTCAC TTTGAAACCC AAATATGGTG 
15 61 CTCCCCTAAT AAT GCACAGG CT GTAGT CCT GT GAGAAT AT GCTATCCGAG G 

SEQ. ID. NO. 264 

1 MGNAHNSKIA AICLIIFLVY KAWELLKWIW IKPKKLESCL RKQGLKGNSY GLFYGDMKEL 
61 SKSLKEINSK PIINLSNEVA PRIIPYYLEI IQKYGKRCFV WQGPTPAILI TEPELIKEIF 
121 GKNYVFQKPN NPNPLTKLLA RGVVSYEEEK WAKHRKILNP AFHMEKLKHM LPAFYLSCSE 
181 MLNKWEEIIP VKESNELDIW PHLQRMTSDV ISRAAFGSSY EEGRRIFELQ EEQAEYLTKT 
241 FNSVYIPGSR FFPNKMNKRM KECEKEVRET ITCLIDNRLK AKEEGNGKAL NDDLLGILLE 
301 SNSIEIEEHG NKKFGMSIPE VIEECKLFYF AGQETTSVLL VWTLILLGRN PEWQERAREE 
361 VFQAFGSDKP TFDELYRLKI VTMILYESLR LYPPIATRTR RTNEETKLGE LDLPKGALLF 
421 IPTILLHLDR EIWGEDADEF NPERFSEGVA KATKGKMTYF PFGAGPRKCI GQNFAILEAK 
481 MAIAMILQRF SFELSPSYTH SPYTVVTLKP KYGAPLIMHR L 
NAME D233-AG7 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 265 

1 CT CAT TAT CC AT CACCTAAA AT GGAGAAT T CTTGGGTTTT TCTAGCCTTG GCAGGGCTAT 
61 CTGCATTAGC TTTTCTCTGT AAAATAATCA CCTGTCGAAG ACCGGTTAAC C G GAAAAT AC 
121 CACCAGGTCC AAAACCATGG C C CAT CATT G GCAATTT GAA CCTACTTGGT CCTATACCAC 
181 AT CAAT CTTT TGACTTGCTT TCCAAAAAAT AT GGAGAGTT GATGCTGCTG AAATTTGGCT 
241 CCAGGCCAGT TCTTGTTGCT TCATCTGCTG AAATGGCAAA ACAGTTTTTA AAAGTACAT G 
301 AT GCTAATTT CGCCTCCCGT CCTAT GCTAG CTGGTGGAAA GTATACAAGC TATAACTATT 
361 GTGACATGAC ATGGGCACCC TATGGTCCCT ATTGGCGCCA AGCACGACGA ATTTACCTTA 
421 ACCAGATATT TACT CCGAAA AGGCTAGACT CGTTCGAGTA CATTCGTGTT GAAGAAAGGC 
481 AGGCCTTGAT TTCCCAGCTG AATTCCCTTG CTGGAAAGCC ATTTTTTCTC AAAG AC CAT T 
541 TGTCGCGATT TAGCCTCTGC AGCAT GACAA GGATGGTTTT GAGCAACAAG TATTTTGGTG 
601 AAT CAACAGT TAGAGTAGAA GATTT GCAGT AC CT GGTAGA TCAATGGTTC TTACTTAATG 
661 GTGCTTTCAA CATT GGAGAT TGGATTCCAT GGCTCAGCTT CTTGGACCTA CAAGGCTATG 
721 TGAAACAAAT GAAGGCTTTG AAAAGAACTT TT GATAAGT T C C AC AAC AT T GTGCTAGATG 
781 AT CACAGGGC TAAGAAGAAT GCAGAGAAGA ACTTTGTCCC AAAAGACATG GT T GAT GT CT 
841 TGTTGAAGAT GGCT GAAGAT CCTAATCTGG AAGT CAAACT CACTAAT GAC T GT GT C AAAG 
901 G GT T AAT G C A G GATTT ACT A ACT GGAGGAA CAGATAGCTT AAC AG CAGC A GTGCAATGGG 
961 CATTTCAAGA ACTTCTTAGA CAGCCAAGGG TTATTGAGAA GGCAACCGAA GAGCTTGACC 
1021 GGATTGTCGG GAAAGAGAGA TGGGTAGAAG AGAAAGATT G CTCGCAGCTA TCTTACGTTG 
1081 AAGCAAT CCT CAAGGAAACA CTAAGGT TAC AT C CT CTAGG AACT AT GCT A GC AC C G CAT T 
1141 GT GCTATAGA AGAT T GT AAC GTGGCTGGTT AT GAC AT AC A GAAAGGAACG ACCTTTCTGG 
1201 TGAATGTTTG GAC CATT GGA AGGGACCCAA AAT ACT GGGA TAGAGCACAA GAGTTTCTCC 

12 61 CCGAGAGATT TTTAGAGAAC GACATT GATA TGGACGGACA TAACTTT GCT TTCTTGCCAT 
1321 TTGGCTCGGG GCGAAGGAGG TGCCCTGGCT AT AGCCTT GG ACTTAAGGTT AT C CGAGTAA 

13 81 CATTAGCCAA CATGTTGCAT GGATTCAACT GGAAATTACC TGAAGGTATG AAGC CAGAAG 
1441 AT AT AAGT GT GGAAGAACAT TAT GGGCT C A CTACACATCC TAAGTTTCCT GTTCCTGTGA 
1501 TCTTGGAATC TAGACTTT CT TCAGATCTCT ATTCCCCCAT CACTTAATCC T AAGT GCT TC 
15 61 CTATTATAGC 

SEQ. ID. NO. 266 

1 MENSWVFLAL AGLSALAFLC KIITCRRPVN RKIPPGPKPW PIIGNLNLLG PIPHQSFDLL 
61 SKKYGELMLL KFGSRPVLVA SSAEMAKQFL KVHDANFASR PMLAGGKYTS YNYCDMTWAP 
121 YGPYWRQARR IYLNQIFTPK RLDSFEYIRV EERQALISQL NSLAGKPFFL KDHLSRFSLC 
181 SMTRMVLSNK YFGESTVRVE DLQYLVDQWF LLNGAFNIGD WIPWLSFLDL QGYVKQMKAL 
241 KRTFDKFHNI VLDDHRAKKN AEKNFVPKDM VDVLLKMAED PNLEVKLTND CVKGLMQDLL 
301 TGGTDSLTAA VQWAFQELLR QPRVIEKATE ELDRIVGKER WVEEKDCSQL SYVEAILKET 
361 LRLHPLGTML APHCAIEDCN VAGYDIQKGT TFLVNVWTIG RDPKYWDRAQ EFLPERFLEN 
421 DIDMDGHNFA FLPFGSGRRR CPGYSLGLKV IRVTLANMLH GFNWKLPEGM KPEDISVEEH 
481 YGLTTHPKFP VPVILESRLS SDLYSPIT 
NAME D257-AE4 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 267 

1 CACATTGAGT CCTCTCCCAA ATCACTGATT CACCACCAAA AGTACCAACA ATTCAATGGA 
61 AGGTACAAAC TTGACTACAT ATGCAGCAGT ATTTCTTGAT ACTCTGTTTC TTTTGTTCCT 
121 TTCCAAACTT CTTCGCCAGA GGAAACTCAA TTTACCTCCA GGCCCAAAAC CAT GGCC GAT 
181 CAT C GGAAAC TTAAAC CTTA TTGGCAATCT TCCTCATCGC TCAATCCACG AACTCTCCCT 
241 CAAGTACGGA CCCGTTATGC AACTCCAATT CGGGTCTTTC CCCGTTGTAG TTGGATCCTC 
301 CGTCGAAATG GCTAAGATTT TCCTCAAATC CATGGATATT AACTTT GTAG GCAGGCCTAA 
3 61 AACGGCTGCC GGAAAATACA CAACGTACAA TTATTCCGAT ATTACAT GGT CTCCTTACGG 
421 AC CAT ATT GG CGCCAGGCAC GTAGGATGTG CCTAACGGAA TTATT CAGCA CGAAACGTCT 
481 CGATT CATAC GAGT AT ATT C GGGCT GAGGA GTTGCATTCT CTTCTCCATA ATTT GAACAA 
541 AATATCAGGG AAACCAATTG TGTTGAAAGA TTATTTGACG ACGTT GAGTT TAAAT GTT AT 
601 TAGCAGGATG GTACTGGGGA AAAGGTATTT GGACGAATCC GAGAACTCGT TCGTGAATCC 
661 TGAGGAATTT AAGAAGATGT TGGACGAATT GTTTTTGCTA AATGGTGTAC TTAATATTGG 
721 AGATTCAATT C CAT GGATT G ATTTCATGGA TTTGCAAGGT TATGTTAAGA GGAT GAAAGT 
781 AGT GAGCAAG AAATT C GACA AGTTTTTAGA G CAT GT TAT T GAT GAGCATA ACATTAGGAG 
841 AAAT GGAGT G G AGAAT TAT G TTGCTAAGGA TAT GGT GGAT GTTTTGTTGC AGCTTGCTGA 
901 TGAT CCGAAG TT GGAAGTTA AGCTGGAGAG AC AT G GAGT C AAAGCATTCA CT CAG GAT AT 
961 GCTGGCTGGT GGAACCGAGA GTTCAGCAGT GACAGTGGAG TGGGCAATTT CAGAGCTGCT 
1021 AAAGAAGCCG GAGATTTTCA AAAAGGCTAC AGAAGAATTG GAT CGAGTAA TTGGGCAGAA 
1081 TAGAT GGGTA CAAGAAAAGG ACATT C CAAA TCATCcTTAC ATAGAGGCAA TAGTCAAAGA 
1141 GACTATGCGA CTGCACCCCG TGGCACCAAT GTTGGTGCCA CGTGAGTGTC GAGAAGATAT 
1201 TAAGGTAGCA GGCTACGACG TT CAGAAAGG AACTAGGGTT CTCGTGAGTG TAT GGACTAT 
1261 TGGAAGAGAC C CTACATT GT GGGACGAGCC TGAGGTGTTC AAGCCGGAGA GATTCCATGA 
1321 AAAGT CCATA GAT GTTAAAG GACAT GATTA TGAGCTTTTG CCATTTGGAG CGGGGAGAAG 
1381 AATGTGCCCG GGTTATAGCT TGGGGCTCAA GGTGATTCAA GCTAGCTTAG CTAATCTTCT 
1441 ACAT GGATT T AACTGGTCAT TGCCTGATAA TATGACTCCT GAGGACCT CA ACATGGATGA 
1501 GATTTTTGGG CTCTCTACAC CTAAAAAATT TCCACTTGCT ACT GT GATT G AGCCAAGACT 
1561 TT CACCAAAA CTTTACTCTG TTT GATT CAG GAGTT CTAT G GATCCGTCAA GATAGAC 

SEQ. ID. NO. 268 

1 MEGTNLTTYA AVFLDTLFLL FLSKLLRQRK LNLPPGPKPW PIIGNLNLIG NLPHRSIHEL 
61 SLKYGPVMQL QFGSFPVVVG SSVEMAKIFL KSMDINFVGR PKTAAGKYTT YNYSDITWSP 
121 YGPYWRQARR MCLTELFSTK RLDSYEYIRA EELHSLLHNL NKISGKPIVL KDYLTTLSLN 
181 VISRMVLGKR YLDESENSFV NPEEFKKMLD ELFLLNGVLN IGDSIPWIDF MDLQGYVKRM 
241 KVVSKKFDKF LEHVIDEHNI RRNGVENYVA KDMVDVLLQL ADDPKLEVKL ERHGVKAFTQ 
301 DMLAGGTESS AVTVEWAISE LLKKPEIFKK ATEELDRVIG QNRWVQEKDI PNHPYIEAIV 
361 KETMRLHPVA PMLVPRECRE DIKVAGYDVQ KGTRVLVSVW TIGRDPTLWD EPEVFKPERF 
421 HEKSIDVKGH DYELLPFGAG RRMCPGYSLG LKVIQASLAN LLHGFNWSLP DNMTPEDLNM 
481 DEIFGLSTPK KFPLATVIEP RLSPKLYSV 
NAME D268-AE2 

ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 269 

1 T GCAATAT AG TTTTCCTAGT CAGTTCTAGC CTCCTTTTCC TTAGAAATAA TGGATTATCA 
61 TATTTCTTTC CATTTTCAAG CTCTTTTAGG GCTTTTAGCC TTTGTGTTCT TGTCTATTAT 
121 CTTAT GGAGA AGAACACTCA CTTCAAGAAA ATTAGCCCCT GAAATCCCAG GGGCATGGCC 
181 TAT TAT AGGC CATCTTCGTC AGCTGAGTGG TACT GATAAG AATATCCCAT TTCCCCGAAT 
241 ATTGGGCGCT TTGGCAGATA AATATGGACC TGTCTTCACA CTGAGAATAG GGATGTACCC 
301 CTATTT GAT T GTCAACAATT GGGAAGCAGC TAAGGATTGT CTCACAACGC AT GAT AAGGA 
361 CTTCGCTGCC CGACCAACTT CTATGGCTGG TGAAAGCATC GGGTACAAGT ATGCGAGGTT 
421 TACTTATGCT AATTTTGGTC CTTATTATAA CCAAGTGCGC AAACTAGCCC TACAACATGT 
4 81 ACT CT C GAGT ACTAAACTCG AGAAAAT GAA AC AC AT AC GT GTTTCTGAAT TGGAAACTAG 
541 CAT CAAAGAA TTATATTCTT TGACGCTGGG CAAAAACAAC AT GCAAAAAG TGAATATAAG 
601 TAAATGGTTT GAACAATTGA CTTTAAACAT AATCGTGAAG ACAATTTGTG GCAAGAGATA 
661 TAG C AAC AT A GAGGAGGATG AAGAGGCACA ACGTTTCAGA AAGGCATTTA AGGGCATCAT 
721 GTTTGTTGTA GGGCAAATTG TTTTATATGA CGCAATTCCA TTCCCATTGT TCAAATACTT 
781 TGATTTCCAA GGT CATAT AC AATTGATGAA CAAAATTTAT AAAGACTTAG ATTCTATTCT 
841 TCAAGGATGG TTGGATGATC AT AT GAT GAA CAAGGAT GTA AACAATAAGG AT CAAGAT GC 
901 CATAGAT GCC AT GCTTAAGG TAACACAACT TAATGAATTC AAAGCCTATG GTTTTTCTCA 
961 GGCCACTGTG AT CAAGT CGA CAGTCTTGAG TTTGATCTTA GAT GGAAAT G ACACAACCGC 
1021 TGTTCATTTG ATATGGGTAA TGTCCTTATT ACT GAACAAT CCACATGTTA TGAAACAAGG 
1081 CCAAGAAGAG AT AGACAT GA AAGT GGGTAA AGAGAGGTGG ATTGAAGATA CT GACAT AAA 
1141 AAATTTAGT G TACCTTCAGG CTAT CGTTAA AGAGACATTG CGCTTGTATC CACCTGTTCC 
1201 TTTTCTTTTA C C AC AC GAAG CAGT GCAAGA TTGTAAAGTG ACTGGTTACC ACATTCCTAA 
1261 AGGTACTCGT CTAT AT AT CA ATGCGTGGAA AGTACAT CGC GATTCTGAAA TTTGGTCAGA 
1321 GCCCGAAAAG TTTATGCCCA ATAGATT CTT GACT AG C AAA GCAAATATAG ATGCTCGCGG 
1381 TCAAAATTTT GAATT T AT AC CGTTTGGTTC TGGGAGACGG T CAT GT CCAG GGTTAGGTTT 
1441 TGCGACTTTA GTGACACATC TGACTTTTGG TCGCTTGCTT CAAGGTTTTG ATTTTAGTAA 
1501 GCCAT CAAAC ACGCCAATTG ACATGACAGA AGGC GT AGGC GTTACTTTGC CTAAGGTTAA 
1561 T CAAGTT GAA GTT CTAATTA CCCCTCGTTT ACCTT CTAAG CTTTATTTAT TTTGAAAGTG 
1621 C AAAT CAT CA AT CAT GGGT T GAGTAATTAG T GAT ACT 

SEQ. ID. NO. 270 

1 MDYHISFHFQ ALLGLLAFVF LSIILWRRTL TSRKLAPEIP GAWPIIGHLR QLSGTDKNIP 

61 FPRILGALAD KYGPVFTLRI GMYPYLIVNN WEAAKDCLTT HDKDFAARPT SMAGESIGYK 

121 YARFTYANFG PYYNQVRKLA LQHVLSSTKL EKMKHIRVSE LETSIKELYS LTLGKNNMQK 

181 VNISKWFEQL TLNIIVKTIC GKRYSNIEED EEAQRFRKAF KGIMFVVGQI VLYDAIPFPL 

241 FKYFDFQGHI QLMNKIYKDL DSILQGWLDD HMMNKDVNNK DQDAIDAMLK VTQLNEFKAY 

301 GFSQATVIKS TVLSLILDGN DTTAVHLIWV MSLLLNNPHV MKQGQEEIDM KVGKERWIED 

361 TDIKNLVYLQ AIVKETLRLY PPVPFLLPHE AVQDCKVTGY HIPKGTRLYI NAWKVHRDSE 

421 IWSEPEKFMP NRFLTSKANI DARGQNFEFI PFGSGRRSCP GLGFATLVTH LTFGRLLQGF 

481 DFSKPSNTPI DMTEGVGVTL PKVNQVEVLI TPRLPSKLYL F 




FIG. 136 



NAME D283-AC1 
ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 271 

1 GTGAA AATGGACGCA CTACTTCAAA TGACAGTAAC AGCATCTTGT GCTGCCATAG 
61 TAATTACTCT GCTGGTGTGT AT AT GGAGAG TGCTGAACTG GATTTGGTTC AGACCAAAGA 
121 AATTGGAGTT GTTGTTGAGA AAACAAGGTT TGGAAGGAAA TTCTTACAAG GTTTTGTATG 
181 GGGACAT GAA AGAGTTTTCT GGGATGATTA AGGAAGCATA CTCAAAGCCT ATGAGTCTAT 
241 CTGATGATGT AGCACCAAGA CTGATGCCTT TCTTTCTTGA AACCATCAAA AAATAT GGAA 
301 AAAGATCCTT TATATGGTTT GGTCCAAGAC CACTAGTATT GATTAT GGAT CCTGAGCTTA 
361 TAAAGGAAGT ACT CT CAAAA ATCCATCTGT ATCAAAAGCC TGGTGGAAAT CCATTAGCAA 
421 CACTATTGGT ACAAGGAATA GCAACCTATG AGGAAGACAA AT GGGCCAAA CATAGAAAAA 
481 TCATCAATCC CGCTTTCCAT CTAGAGAAGC TAAAGCTTAT GCTTCCAGCA TTTCGCTTAA 
541 GCTGTAGTGA GATGCTGAGC AAATGGGAAG ACATTGTTTC AGCTGATAGC T CACAT GAGA 
601 TAGATGTATG GTCTCACCTT GAGCAATTGA CTTGCGATGT GAT CT CT C GG ACAGCTTTTG 
661 GCAGTAGTTA TGAAGAAGGT AGAAAGATTT TTGAACTTCA AAAGGAACAA GCTCAGTATC 
721 TTGTGGAAGT TTTCCGCTCC GTTTATATCC CAGGAAGGAG ATTTTTGCCA ACAAAGAGGA 
781 ATAGAAGAAT GAAGGAAATA AAAAAGGATG TCCGGGCATC AATTAAAGGT ATTATTGATA 
841 AAAGATTGAA GGCAATGAAA GCAGGGGACA CCAATAATGA GGATCTATTG GGTATATTAC 
901 TGGAATCGaA TATTAAAGAA ATTGAACAGC ACGGAAACAA GGATTTTGGA ATGAGCATTG 
961 AAGAAGTCAT TGAAGAATGC AAGTTATTCT ATTTTGCTGG CCAAGAAACT ACATCAGTGT 
1021 TACTCCTATG GTCTCTAGTG TTGTTGAGCA GGTAT CAAGA TTGGCAGGCA CGGGCCAGAG 
1081 AAGAAATCTT GCAAGTCTTT GGCAGTCGAA AACCAGATTT TGACGGATTA AAT CAT CTAA 
1141 AAATTGTGAC AATGATCTTG TACGAGTCTT TAAGGCTGTA TCCCTCACTA ATAACACTTA 
1201 CCCGCCGGTG TAATGAAGAC ATT GTATTAG GAGAACTAT C TCTACCAGCT GGTGTTCTAG 
1261 TCTCTTTGCC ATTGATTTTG TTGCATCATG AT GAAGAGAT ATGGGGTGAA GAT GCAAAGG 
1321 AGTTCAAACC AGAGAGATTT AGAGAAGGAA TAT CAAGT GC AACAAAGGGT CAACTCACAT 
1381 ATTTTCCATT TAGCTGGGGT C CT AGAAT AT GTATT GGACA AAATTTTGCC AT GTTAGAAG 
1441 CAAAGATGGC TCTGTCTATG ATCCTGCAAC GCTTCTCTTT T GAACTGT CT CCGTCTTATG 
1501 CACATGCCCC TCGGTCCATA ATAACCGTTC AGCCTCAGTA TGGTGCTCCA CTTATTTTCC 
1561 ACAAACTATA ATTTTGGTAC TTCTACTAAT ATTTTAGGGT TTATTCAGAC TCAAAAAAAA 

SEQ. ID. NO. 272 

1 MTVTASCAAI VITLLVCIWR VLNWIWFRPK KLELLLRKQG LEGNSYKVLY GDMKEFSGMI 
61 KEAYSKPMSL SDDVAPRLMP FFLETIKKYG KRSFIWFGPR PLVLIMDPEL IKEVLSKIHL 
121 YQKPGGNPLA TLLVQGIATY EEDKWAKHRK IINPAFHLEK LKLMLPAFRL SCSEMLSKWE 
181 DIVSADSSHE IDVWSHLEQL TCDVISRTAF GSSYEEGRKI FELQKEQAQY LVEVFRSVYI 
241 PGRRFLPTKR NRRMKEIKKD VRASIKGIID KRLKAMKAGD TNNEDLLGIL LESNIKEIEQ 
301 HGNKDFGMSI EEVIEECKLF YFAGQETTSV LLLWSLVLLS RYQDWQARAR EEILQVFGSR 
361 KPDFDGLNHL KIVTMILYES LRLYPSLITL TRRCNEDIVL GELSLPAGVL VSLPLILLHH 
421 DEEIWGEDAK EFKPERFREG ISSATKGQLT YFPFSWGPRI CIGQNFAMLE AKMALSMILQ 
481 RFSFELSPSY AHAPRSIITV QPQYGAPLIF HKL 
NAME D244-AB6 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 273 

1 TGCAATATAG TTTTCCTAGT CAGTT CTAGC CTCCTTTTCC TTAGAAATAA TGGATTATCA 
61 TATTTCTTTC CATTTTCAAG CTCTTTTAGG GCTTTTAGCC TTTGTGTTCT TGTCTATTAT 
121 CTTATGGAGA AGAAC ACT CA CTTCAAGAAA ATTAGCCCCT GAAATCCCAG GGGCAT GGCC 
181 TAT TATAGGC CATCTTCGTC AGCTGAGTGG TACT GATAAG AAT AT C C CAT TTCCCCGAAT 
241 ATTGGGCGCT TTGGCAGATA AAT AT GGACC TGTCTTCACA CTGAGAATAG GGATGTACCC 
3 01 CTATTTGATT GTCAACAATT GGGAAGCAGC TAAGGATTGT CTCACAACGC AT GATAAGGA 

3 61 CTTggCTGCC CGACCAACTT CTATGGCTGG TGAAAGCATC GGGTACAAGT ATGCGAGGTT 
421 TACT TAT GCT AATTTTGGTC CTTATTATAA CCAAGT GCGC AAACTAGCCC TACAACATGT 

4 81 ACTCTCGAGT ACTAAACT CG AGAAAATGAA ACACATACGT GTTTCTGAAT TGGAAACTAG 
541 CAT C AAAGAA TTATATT CTT TGACGCTGGG CAAAAACAAC AT GCAAAAAG TGAATATAAG 
601 TAAAT GGTTT GAACAATTGA CTTTAAACAT AATCGTGAAG ACAATTTGTG GCAAGAGATA 
661 TAGCAACATA GAGGAG GAT G AAGAG GCACA ACGTTTCAGA AAGGCATTTA AGGGCATCAT 
721 GTTTGTTGTA GGGCAAATTG TTTTATATGA CGCAATTCCA TTCCCATTGT TCAAATACTT 

7 81 TGATTTCCAA GGT CATATAC AAT T GAT G AA CAAAAT T TAT AAAGACT TAG ATTCTATTCT 

8 41 TCAAGGATGG TTGGATGATC AT AT GAT GAA CAAGGATGTA AACAATAAGG ATCAAGATGC 
901 CAT AGAT GC C ATGCTTAAGG TAACACAACT TAATGAATTC AAAGC CTAT G GTTTTTCTCA 

9 61 GGCCACTGTG AT CAAGT CGA CAGTCTTGAG TTTGATCTTA GAT GGAAAT G ACACAACCGC 
1021 TGTTCATTTG ATAT GGGTAA TGTCCTTATT ACT GAACAAT CCACATGTTA TGAAACAAGG 
10 81 C CAAG AAGAG AT AGAC AT GA AAGT GGGTAA AGAGAGGT GG ATTGAAGATA CT GACATAAA 
1141 AAATTTAGTG TACCTTCAGG CTAT C GTTAA AGAGACATT G CGCTTGTATC CACCTGTTCC 
12 01 TTTTCTTTTA CCACACGAAG CAGT GCAAGA TTGTAAAGTG ACT GGTTACC ACATTCCTAA 

1261 AGGTACTCGT CTATATATCA ATGCGTGGAA AGTACATCGC GATCCTGAAA TTTGGTCAGA 
1321 GCCCGAAAAG TTTATGCCCA ATAGATT CTT GACTAGCAAA GCAAATATAG AT GCT CGCGG 

13 81 TCAAAATTTT GAAT TTAT AC CGTTTGGTTC TGGGAGACGG T CAT GT CCAG GGATAGGTTT 
1441 TGCGACTTTA GTGACACATC TGACTTTTGG TCGCTTGCTT CAAGGTTTTG ATTTTAGTAA 
1501 GCCAT CAAAC ACGCCAATTG ACATGACAGA AGGCGTAGGC GTTACTTTGC CT AAG GT T AA 
1561 T CAAGT T GAA GT T C T AAT T A CCCCTCGTTT ACCTT CTAAG CTTTATTTAT TTTGAAGGTG 
1621 CAAAT CAT CA ATCATGGCTT GAGTAATTAG TTATACTTTA ATATGTTTCT C 

SEQ. ID. NO. 274 

1 MDYHISFHFQ ALLGLLAFVF LSIILWRRTL TSRKLAPEIP GAWPIIGHLR QLSGTDKNIP 

61 FPRILGALAD KYGPVFTLRI GMYPYLIVNN WEAAKDCLTT HDKDLAARPT SMAGESIGYK 

121 YARFTYANFG PYYNQVRKLA LQHVLSSTKL EKMKHIRVSE LETSIKELYS LTLGKNNMQK 

181 VNISKWFEQL TLNIIVKTIC GKRYSNIEED EEAQRFRKAF KGIMFVVGQI VLYDAIPFPL 

241 FKYFDFQGHI QLMNKIYKDL DSILQGWLDD HMMNKDVNNK DQDAIDAMLK VTQLNEFKAY 

301 GFSQATVIKS TVLSLILDGN DTTAVHLIWV MSLLLNNPHV MKQGQEEIDM KVGKERWIED 

361 TDIKNLVYLQ AIVKETLRLY PPVPFLLPHE AVQDCKVTGY HIPKGTRLYI NAWKVHRDPE 

421 IWSEPEKFMP NRFLTSKANI DARGQNFEFI PFGSGRRSCP GIGFATLVTH LTFGRLLQGF 

481 DFSKPSNTPI DMTEGVGVTL PKVNQVEVLI TPRLPSKLYL F 
NAME D205-BE9 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 275 

1 TTTGATTCAA CCATGGAGAA CCAATACTCC TACTCATTCT CTTCCTACTT CTACTTAGCT 
61 ATAGTACTGT TTCTTCTTCC AATTTTGGTC AAATATTTCT T C CAT C GGAG AAGAAATTTA 
121 CCTCCAAGTC CATTTTCTCT TCCAATAATT GGTCACCTTT ACCTTCTCAA GAAAACTCTC 
181 CATCTCACTC TAACAT CCTT AT CAGCTAAA TATGGTCCTG TTTTATACCT CAAATTGGGC 
241 TCTATGCCTG TGATTGTTGT GTCCTCACCA TCTGCTGTTG AAGAAT GTTT AACCAAGAAT 
301 GAT AT CAT AT TCGCAAATAG GCCCAAGACC GTGGCTGGTG ACAAGT T T AC CTACAATTAT 
361 ACT GT TTAT G TTTGGGCACC CTATGGCCAA CTTTGGAGAA TTCTTCGCCG ATTAACTGTC 
421 GTTGAACTCT TCTCTTCACA TAGCCTACAG AAAACTT CTA TCCTTAGAGA TCAAGAAGTT 
481 GCAATATTTA TCCGTTCGTT ATACAAATTC TCAAAGGATA GTAGCAAAAA AGTCGATTTG 
541 ACCAACTGGT CTTTTACTTT GGTTTTCAAT CTTAT GACCA AAATTATTGC TGGGAGACAT 
601 ATTGTGAAGG AGGAAGATGC TGGCAAGGAA AAGGGCATTG AAATTATTGA AAAACTTAGA 
661 GGGACTTTCT TAGTAACTAC ATCATTCTTG AATATGTGTG ATTTCTTGCC AGTATTCAGG 
721 TGGGTTGGTT ACAAAGGGCA GGAGAAGAAG ATGGCCTCAA TTCACAATAG AAGAAATGAA 
781 TTCTTGAACA GCTTGCTTGA TGAATTTCGA CACAAGAAAA GTAGTGCTTC ACAATCTAAC 
841 ACAACTGTTG GAAAC AT GGA GAAGAAAACC ACACT GATTG AAAAGCTCTT GTCTCTTCAA 
901 GAAT CAGAGC CTGAATTCTA CACTGATGAT ATCATCAAAA GTATTAT GCT GGTAGTTTTT 
961 GTT GCAGGAA CAGAGAC CTC AT CAACAACC ATCCAATGGG TAAT GAGGCT TCTTGTAGCT 
1021 CACCCTGAGG CATT GTATAA GCTACGAGCT GACATT GACA GTAAAGTTGG GAATAAGCGC 
1081 TTGCTGAATG AAT CAGACCT CAACAAGCTT CCGTATTTGC ATTGTGTTGT TAAT GAGAC A 
1141 AT GAGATTAT ACACTCCGAT ACCACTTTTA TTGCCTCATT ATTCAACTAA AGATT GTATT 
1201 GT GGAAGGAT AT GAT GTACC AAAACATACA ATGTTGTTTG TCAACGCTTG GGCCATTCAC 
1261 AGGGATCCCA AGGTATGGGA GGAGCCTGAC AAGTTCAAGC CAGAGAGATT T GAGGCAACA 
1321 GAAGGGGAAA CAGAAAGGTT CAATTACAAG CTTGTACCAT TTGGAATGGG GAGAAGAGCG 
1381 TGCCCTGGAG CTGATATGGG GTT GCGAGCA GTTTCTTTGG CATTAGGTGC ACTTATTCAA 
1441 TGCTTTGACT GGCAAATTGA GGAAGCGGAA AGCTTGGAGG AAAGCTATAA TTCTAGAATG 
1501 ACTAT GCAGA ACAAGCCTTT GAAGGTT GTC TGCACTCCAC GCGAAGATCT TGGCCAGCTT 
1561 CTATCCCAAC TCTAAGGCAA TTTATCAATG CCAAACGTAA TCTTCATCTA CCACTATG 

SEQ. ID. NO. 276 

1 MENQYSYSFS SYFYLAIVLF LLPILVKYFF HRRRNLPPSP FSLPIIGHLY LLKKTLHLTL 

61 TSLSAKYGPV LYLKLGSMPV IVVSSPSAVE ECLTKNDIIF ANRPKTVAGD KFTYNYTVYV 

121 WAPYGQLWRI LRRLTVVELF SSHSLQKTSI LRDQEVAIFI RSLYKFSKDS SKKVDLTNWS 

181 FTLVFNLMTK IIAGRHIVKE EDAGKEKGIE IIEKLRGTFL VTTSFLNMCD FLPVFRWVGY 

241 KGQEKKMASI HNRRNEFLNS LLDEFRHKKS SASQSNTTVG NMEKKTTLIE KLLSLQESEP 

301 EFYTDDIIKS IMLVVFVAGT ETSSTTIQWV MRLLVAHPEA LYKLRADIDS KVGNKRLLNE 

361 SDLNKLPYLH CVVNETMRLY TPIPLLLPHY STKDCIVEGY DVPKHTMLFV NAWAIHRDPK 

421 VWEEPDKFKP ERFEATEGET ERFNYKLVPF GMGRRACPGA DMGLRAVSLA LGALIQCFDW 

481 QIEEAESLEE SYNSRMTMQN KPLKVVCTPR EDLGQLLSQL 



FIG. 139 






NAME D136-AF4 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 277 

1 CCTTTTTAAG AT GTATTTAA GATTTAAGAT TTAAGATGAA GCAACT GAGG TAAGTCCTTT 
61 CAAGGAGTAG TTGTCACTTC TGAGAATGGA GAT GAT GT AC AGCATAATAG CAGCAGCCAG 
121 TATT GCAATT ATCTTGGTAT ATACAT GGAA AGTGTTGAAT TGGGCTTGGT TTGGGCCGAA 
181 GAAAAT GGAG AAAT GCTTAA GACAGAGGGG T CT CAAGGGA AAT CCT TATA AGCTACTCTA 
241 TGGAGATCTA AACGAACTGA CAAAAAGCAT AATAGAAGCC AAGTCTAAGC CCATCAATTT 
301 CT C T GAT GAT ATTGCTCAAA GGCTCATCCC TTTTTTTCTT GACGCCATCA ACAAAAATGG 
361 TAAAAACT CC TTCGTCTGGC TTGGACCGTA TCCAATAGTG TTGATCACGG AT CCT GAGCA 
421 TTTAAAGGAG ATTTTCACAA AGAAT TAT GT GT AT CAAAAG CAAACT CAT C CCAATCCATA 
481 CGCCAAGCTA TTAGCTCACG GTCTTGTCAG CCTT GAGGAA GACAAAT GGG C CAAACAC AG 
541 AAAAAT CAT T AGTCCTGCCT TCCATGTCGA GAAGCTAAAG CATATGCTGC CTGCATTTTA 
601 TCTGAGTTGT AGT GAAATGA TAAGCAAATG GGAGGAGGTT GTT C CAAAAG AAAC AT CAT T 
661 CGAGCTCGAT GTATGGCCAG ACCTTCAAAT AATGACCAGT GAAGT CATTT CTCGCACTGC 
721 ATTTGGGAGT AGCTATGAAG AAGGAAGAAT AGTATTTGAA CTT CAGAAAG AACAAGCTGA 
781 GTATGTAATG GACATAGGAC GTTCAATTTA TAT AC CAGGA TCAAGGTTCT TGCCTACTAA 
841 AAGGAACAAA AGAAT GCTGG AAATTGAAAA GCAAGT GCAA ACAACAATTA GGCGTAT CAT 
901 CGACAAAAGA TTGAAGGCAA TGGAAGAAGG GGAGACTAGT AAAGAT GACT TATTAGGCAT 
961 AT TACTT GAA TCCAATTTGA AAGAAATTGA ACTT CAT GGA AGAAATGACT TGGGAATAAC 
1021 AACGTCAGAA GTGATTGAAG AGT GCAAGT T AT T CTATTT T GCCGGCCAAG AGACCACTTC 
1081 AGTGTTGCTT GTTTGGACAA TGATTTTGTT GT GCTTACAT CCAGAGT GGC AAGTACGTGC 
1141 CAGAAAG GAA GT GTT GCAGA TCTTTGGAAA T GAT AAAC CA GATTTGGAAG GACTAAGTCG 
1201 CTT GAAAAT T GTAACAATGA TCTTGTACGA GACGT TACGC CTATTCCCCC CATTACCAGC 
1261 ATTTGGTAGA AGGAACAAAG AAGAAGT CAA ATTAGGGGAG CTACAT CTAC CGGCTGGAGT 
1321 GTTACTCGTT AT AC CAGCAA T C T T AGT AC A TTAT GATAAG GAAATAT GGG GTGAAGATGC 
1381 AAAGGAATTC AAAC C AGAAA GATTCAGTGA AGGAGTGTCA AAGGGAACAA AT GGACAAGT 
1441 CT CAT TTAT A CCATTTAGCT GGGGACCTCG TGTTTGCATT GGACAAAACT T CGCAAT GAT 
1501 GGAAGCAAAA AT GGCAGTAA CTAT GAT ACT ACAAAAATTC TCCTTTGAAC TATCCCCTTC 
1561 T TAT AC AC AT GCTCCATTTG CAATTGTGAC TATTCATCCC CAGTATGGTG CTCCTCTGCT 
1621 TAT GC GCAGA CTTTAAAACA TATGTTGCTG ATATTTAAGA TCAGTGGCGT TTTATT 

SEQ. ID. NO. 278 

1 MEMMYSIIAA ASIAIILVYT WKVLNWAWFG PKKMEKCLRQ RGLKGNPYKL LYGDLNELTK 

61 SIIEAKSKPI NFSDDIAQRL IPFFLDAINK NGKNSFVWLG PYPIVLITDP EHLKEIFTKN 

121 YVYQKQTHPN PYAKLLAHGL VSLEEDKWAK HRKIISPAFH VEKLKHMLPA FYLSCSEMIS 

181 KWEEVVPKET SFELDVWPDL QIMTSEVISR TAFGSSYEEG RIVFELQKEQ AEYVMDIGRS 

241 IYIPGSRFLP TKRNKRMLEI EKQVQTTIRR IIDKRLKAME EGETSKDDLL GILLESNLKE 

301 IELHGRNDLG ITTSEVIEEC KLFYFAGQET TSVLLVWTMI LLCLHPEWQV RARKEVLQIF 

361 GNDKPDLEGL SRLKIVTMIL YETLRLFPPL PAFGRRNKEE VKLGELHLPA GVLLVIPAIL 
421 VHYDKEIWGE DAKEFKPERF SEGVSKATNG QVSFIPFSWG PRVCIGQNFA MMEAKMAVTM 

481 ILQKFSFELS PSYTHAPFAI VTIHPQYGAP LLMRRL 
NAME D101-BA2 

ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 279 

1 CTAAATTTCA TAT AC CT TT A GTACTCTTGA AATTTTCAAA TAATGGTTTA TCTTCTTTCT 
61 CCCATAGAAG CCATTGTAGG ATTTGTAACC TTTTCATTTC TATT CTACTT TCTATGGACC 
121 AAAAAACAAT CAAAAATCTT AAACC CACT A CCTCCAAAAA TCCCAGGTGG AT GGC CAGTA 
181 AT C GGC CAT C TCTTTTATTT CAAGAACAAT GGCGAT GAAG ATCGCCATTT TTCTCAAAAA 
241 CTCGGTGACT TAGCT GACAA ATATGGTCCC GTCTTCACTT TCCGGTTAGG GTTTCGCCGT 
301 TTCTTGGCGG T GAGTAGTT A TGAAGCTATG AAAGAAT GCT T CACT AC C AA T GAT AT C CAT 
361 TTCGCCGATC GGCCATCTTT ACT CTACGGA GAATACCTTT GCTATAATAA TGCCATGCTT 
421 GCTGTTGCCA AATATGGCCC TTACTGGAAA AAAAATCGAA AGTTAGTCAA TCAAGAAGTT 
481 CTCTCCGTTA GTCGGCTCGA AAAATTCAAA CAT GT T AGAT TTT CTATAAT TCAGAAAAAT 
541 ATTAAACAAT T GTATAATT G TGATTCACCA AT GGT GAAGA TAAACCTTAG TGATT GGATA 
601 GATAAATTGA CATTCGACAT CATTTT GAAA ATGGTTGTTG G GAAGAACT A TAATAAT GGA 
661 CAT GGAGAAA TACT CAAAGT TGCTTTTCAG AAATT CAT GG TTCAAGCTAT GGAGAT GGAG 
721 CTCTATGATG TTTTTCACAT TCCATTTTTC AAGTGGTTGG AT CTTACAGG GAATATTAAG 
781 GCTAT GAAAC AAACTTT CAA AGACATTGAT AATATTATCC AAGGTTGGTT AGATGAGCAC 
841 ATTAAGAAGA GAGAAACAAA GGATGTTGGA GGT GAAAAC G AACAAGATTT TAT AGAT GT G 
901 GTGCTTTCCA AGAT GAGCGA CGAACATCTT GGCGAGGGTT ACT CT CAT GA CACAACCATC 
961 AAAGCAACTG TATT CACT TT GGTCTTGGAT GCAACAGACA CACT TGCACT T CAT AT AAAG 
1021 T GGGTAAT GG C GTTAAT GAT AAACAATAAG CAT GT CAT GA AGAAAGCACA AGAAGAGATG 
1081 GACACAATTG TT GGTAGAGA TAGAT GGGTA GAAGAGAGTG AT AT CAAG AA TTT GGT GT AT 
1141 C T C CAAG CAA TTGTTAAAGA AGTATTAC GA T T AC AT C C AC CTGCACCTTT GTCAGTGCAA 
1201 CACCTATCTG TGGAAGATTG TGTTGTCAAT GGGTACCATA TTCCTAAGGG GACT GCACTA 
1261 CTTAC CAATA TTAT GAAAC T ACAGC GAGAT CCTCAAACAT GGCCAAATCC TGATAAATTC 
1321 GATCCAGAGA GATTCCTGAC GACT CAT GCT ACTATTGACT ACCGCGGGCA G CACT AT GAG 
1381 TTGATCCCCT TTGGTACGGG GAGACGAGCT TGTCCCGCGA TGAATTATTC ATT GCAAGT G 
1441 GAACAC CT TT CAATTGCTCA TAT GAT C CAA GGTTTCAGTT TT GCAACTAC GACCAAT GAG 
1501 CCTT T GGATA TGAAACAAGG TGTGGGTTTA ACTTTACCAA AGAAGACT GA TGTTGAAGTT 
1561 CTAATTACCC CTCGTTT 

SEQ. ID. NO. 280 

1 MVYLLSPIEA IVGFVTFSFL FYFLWTKKQS KILNPLPPKI PGGWPVIGHL FYFKNNGDED 
61 RHFSQKLGDL ADKYGPVFTF RLGFRRFLAV SSYEAMKECF TTNDIHFADR PSLLYGEYLC 
121 YNNAMLAVAK YGPYWKKNRK LVNQEVLSVS RLEKFKHVRF SIIQKNIKQL YNCDSPMVKI 
181 NLSDWIDKLT FDIILKMVVG KNYNNGHGEI LKVAFQKFMV QAMEMELYDV FHIPFFKWLD 
241 LTGNIKAMKQ TFKDIDNIIQ GWLDEHIKKR ETKDVGGENE QDFIDVVLSK MSDEHLGEGY 
301 SHDTTIKATV FTLVLDATDT LALHIKWVMA LMINNKHVMK KAQEEMDTIV GRDRWVEESD 
361 IKNLVYLQAI VKEVLRLHPP APLSVQHLSV EDCVVNGYHI PKGTALLTNI MKLQRDPQTW 
421 PNPDKFDPER FLTTHATIDY RGQHYELIPF GTGRRACPAM NYSLQVEHLS IAHMIQGFSF 
481 ATTTNEPLDM KQGVGLTLPK KTDVEVLITP R 
NAME D130-AA1 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 281 

1 CTTTTTCTCC CCAAAAAAGA GCTCATTTCC CTTGTCCCCA AAAATGGATC TTCTCTTACT 
61 AGAGAAGACC TTAATTGGTC TCTTCTTTGC CATTTTAATC GCTGTAATTG TCTCTAGACT 
121 TCGTTCAAAG CGTTTTAAGC TTCCCCCAGG ACCAATCCCA GTACCAGTTT TTGGTAATTG 
181 GCTTCAAGTT GGTGATGATT TAAACCACAG AAAT CTTACT GATTTTGCCA AAAAATTTGG 
241 TGATCTTTTC TTGTTAAGAA TGGGCCAGCG TAATTTAGTT GTTGTGTCAT CTCCTGAATT 
301 AGCTAAAGAA GTTTTACACA CACAAGGTGT TGAATTTGGT T CAAGAACAA GAAATGTTGT 
361 ATT T GAT AT T TTTACTGGAA AAGGT CAAGA TATGGTTTTT ACT GT AT AT G GTGAACACTG 
421 GAGAAAAATG AGGAGAATTA TGACTGTACC ATTTTTTACT AAT AAAGTT G TGCAGCAATA 
481 TAGAGGGGGG TGGGAGTTTG AAGT GGCAAG T GTAATT GAG GATGTGAAGA AAAATCCTGA 
541 ATCTGCTACT AAT GGGATT G TATNAAGGAG GAGATTACAA TT GAT GAT GT ATAATAATAT 
601 GTTTAGGATT AT GTT T GAT A GGAGATTTGA GAGT GAAGAT GAT CCTTT GT TTGTTAAGCT 
661 TAAGGCTTTG AATGGTGAAA GGAGTAGATT GGCT CAGAGT TTT GAGTAT A ATTATGGTGA 
721 TTTTATTCCC ATTTTGAGGC CTTTTTTGAG AGGTTATTTG AAGATCTGTA AAGAAGTTAA 
781 GGAGAAGAGG CTGCAGCTTT TCAAAGATTA CTTTGTTGAT GAAAGAAAGA AGCTTTCAAA 
841 TACCAAGAGC T T GGACAGC A ATGCTCTGAA ATGTGCGATT GAT CACATT C TTGAGGCTCA 
901 ACAGAAGGGG GAG AT C AAT G AGGACAACGT TCTTTACATT GTTGAAAACA TCAATGTTGC 
961 TGCTATAGAA ACCACATTAT GGTCAATTGA GTGGGGTATC GCC GAGTTAG TCAACCACCC 
1021 TCACATCCAA AAGAAACTCC GCGACGAGAT TGACACAGTT CTTGGCCCAG GAGT GCAAGT 
1081 GACT GAACCA GACACCCACA AGCTT CCATA CCTTCAGGCT GT GAT CAAGG AGACGCTTCG 
1141 TCTCCGTATG GCAATTCCTC TATTAGTCCC ACACATGAAC CTTCACGATG CAAAGCTTGG 
1201 CGGGTTTGAT AT T C CAGCAG AGAGCAAAAT CTT GGTTAAC GCTTGGTGGC TAGCTAACAA 
1261 CCCGGCTCAT TGGAAGAAAC CCGAAGAGTT CAGACCCGAG AGGTTCTTCG AAGAGGAGAA 
1321 GCACGTTGAG GCCAATGGCA ATGACTTCAG ATATCTTCCG TTTGGCGTTG GTAGGAGGAG 
1381 TTGCCCTGGA ACTATACTTG CATT GCCAAT TCTTGGCATT ACTTTGGGAC GTTT 

SEQ. ID. NO. 282 

1 MDLLLLEKTL IGLFFAILIA VIVSRLRSKR FKLPPGPIPV PVFGNWLQVG DDLNHRNLTD 
61 FAKKFGDLFL LRMGQRNLVV VSSPELAKEV LHTQGVEFGS RTRNVVFDIF TGKGQDMVFT
121 VYGEHWRKMR RIMTVPFFTN KVVQQYRGGW EFEVASVIED VKKNPESATN GIVLRRRLQL
181 MMYNNMFRIM FDRRFESEDD PLFVKLKALN GERSRLAQSF EYNYGDFIPI LRPFLRGYLK 
241 ICKEVKEKRL QLFKDYFVDE RKKLSNTKSL DSNALKCAID HILEAQQKGE INEDNVLYIV 
301 ENINVAAIET TLWSIEWGIA ELVNHPHIQK KLRDEIDTVL GPGVQVTEPD THKLPYLQAV 
361 IKETLRLRMA IPLLVPHMNL HDAKLGGFDI PAESKILVNA WWLANNPAHW KKPEEFRPER
421 FFEEEKHVEA NGNDFRYLPF GVGRRSCPGT ILALPILGIT LGR 



WO 2004/035745    83/111    PCT/US2003/032722



FIG. 142 



NAME D136-AD5 
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 283 

1 CCAAATTAGA GCAAGAAATT AACAAGTCTA GTTACCTTCT CCCTTTTTAA GAGTATTTAA 
61 GATTTAAGAT TTAAGATGAA GCAACTGAGG TAAGTCCTTT CAAGGAGTAG TTGTCACTTC 
121 TGAGAATGGA GATGATGTAC AGCATAATAG CAGCAGCCAG TAT T GCAATT AT CTT GGTAT 
181 ATACAT GGAA AGTGTTGAAT TGGGCTTGGT TTGGGCCAAA GAAAAT GGAG AAAT GCTTAA 
241 GACAGAGGGG T CT CAAGGGA AATCCTTATA AG CT AC T CT A T GGAGAT CTA AACGAACTGA 
301 CAAAAAGCAT AATAGAAGCC AAGTCTAAGC CCATCAATTT CT CT GAT GAT ATTGCTCAAA 
361 GGCTCATCCC TTTTTTTCTT GACGCCATCA ACAAAAATGG TAAAAACTCC TTCGTCTGGC 
421 TTGGACCGTA TCCAATAGTG TTGATCACGG AT CCT GAGCA TTTAAAGGAG ATTTT CACAA 
481 AGAATTATGT GTATCAAAAG CAAACT CAT C CCAATCCATA CGCCAAGCTA TTAGCTCACG 
541 GTCTTGTCAG CCTT GAGGAA GACAAATGGG CCAAACACAG AAAAAT CAT T AGTCCTGCCT 
601 TCCATGTCGA GAAGCTAAAG CATATGCTGC CTGCATTTTA TCTGAGTTGT AGTGAAATGA 
661 TAAGCAAATG GGAGGAGGTT GTT CCAAAAG AAACAT CATT CGAGCT C GAT GTATGGCCAG 
721 AC CTT CAAAT AATGACCAGT GAAGT CATTT CTCGCACTGC ATTTGGGAGT AGCTAT GAAG 
781 AAGGAAGAAT AGTATTTGAA CTT CAGAAAG AACAAGCTGA GTATGTAATG GACATAGGAC 
841 GTTCAATTTA TAT AC CAGGA TCAAGGTTCT TGCCTACTAA AAGGAACAAA AGAATGCTGG 
901 AAATTGAAAA GCAAGTGCAA ACAACAATTA GGCGTAT CAT CGACAAAAGA TTGAAGGCAA 
961 T GGAAGAAGG GGAGACTAGT AAAGAT GACT TATTAGGCAT ATTACTTGAA TCCAATTTGA 
1021 AAGAAATT GA ACTT CAT GGA AGAAATGACT TGGGAATAAC AACATCAGAA GT GAT T GAAG 
1081 AGTGCAAGTT AATCTATTTT GCCGGCCAAG AGACCACTTC AGTGTTGCTT GTTTGGACAA
1141 TGATTTTGTT GT GCTTACAT CCAGAGT GGC AAGTACGTGC CAGAAAGGAA GTGTTGCAGA 
1201 CCTTTGGAAA T GATAAACCA GATTT GGAAG GACTAAGTCG CTTGAAAATT GTAACAAT GA 
1261 TCTTGTACGA GACGTTACGC CTATTCCCCC CATTACCAGC ATTT GGTAGA AGGAACAAAG 
1321 AAGAAGT CAA ATTAGGGGAG CTACAT CT AC CGGCTGGAGT GTTACTCGTT AT AC CAGCAA 
1381 TCTTAGTACA TTAT GATAAG GAAATAT GGG GTGAAGATGC AAAGGAATTC AAACCAGAAA 
1441 GATTCAGTGA AGGAGTGTCA AAGGCAACAA AT GGACAAGT CT CAT T TATA C CATT TAG CT 
1501 AGGGACCTCG TGTTTGCATT GGACAAAACT T CGCAAT GAT GGAAGCAAAA AT GGCAGTAA 
1561 CT AT GATACT ACAAAAATT C TCCTTTGAAC TATCCCCTTC TTATACACAT GCTCCATTTG 
1621 CAATTGTGAC TATT CAT CCC CAGTAT GGT G CTCCTCTGCT TAT GC GCAGA CTTTAAAACA 
1681 TATGTTGCTG ATATTTAAGA TCAGTGGCGT TTTATTCTCC ATG 

SEQ. ID. NO. 284 

1 MEMMYSIIAA ASIAIILVYT WKVLNWAWFG PKKMEKCLRQ RGLKGNPYKL LYGDLNELTK 
61 SIIEAKSKPI NFSDDIAQRL IPFFLDAINK NGKNSFVWLG PYPIVLITDP EHLKEIFTKN
121 YVYQKQTHPN PYAKLLAHGL VSLEEDKWAK HRKIISPAFH VEKLKHMLPA FYLSCSEMIS
181 KWEEVVPKET SFELDVWPDL QIMTSEVISR TAFGSSYEEG RIVFELQKEQ AEYVMDIGRS
241 IYIPGSRFLP TKRNKRMLEI EKQVQTTIRR IIDKRLKAME EGETSKDDLL GILLESNLKE
301 IELHGRNDLG ITTSEVIEEC KLIYFAGQET TSVLLVWTMI LLCLHPEWQV RARKEVLQTF
361 GNDKPDLEGL SRLKIVTMIL YETLRLFPPL PAFGRRNKEE VKLGELHLPA GVLLVIPAIL 
421 VHYDKEIWGE DAKEFKPERF SEGVSKATNG QVSFIPFS 
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NAME D138-AD12 
ORGANISM NICOTIANA TABACUM

SEQ. ID. NO. 285

1 T CCTTTG CTCGTC ATTG ATGACGACTT CATTTTGTTT TCTTCCCCAC GAAAATGGTA
61 GATATGATAT GGAGGGACGT AGGGAAGAAT TACTGGGACA AACCTAGTGA GTGAAAATGG
121 AAACAGTTGA AATGATAGTA AAAGTATCTT GTGCTGCCAT AGTAATTACT CTGTTGGTGT
181 GTCTATGGAG AGTGCTGAAT TGGGTTTGGT TCAGACCAAA GAAATTAGAG AAGTTGTTGA
241 GAAAACAGGT TTTGTATGGG GACATGAAAG AGTTTTCTGG GATGATTAAG GAAGCATACT
301 CAAAGCCTAT GAGTCTGTCT GATGATGTAG CACCACGAAT GATGCCTTTC TTTCTTGAAA
361 CCATCAAGAA ATATGGAAAA AGATCCTTTA TATGGTTCGG TCCAAGACCA CTAGTATTGA
421 TCATGGATCC TGAGCTTATA AAGGAAGTAC TCTCCAAAAT CTATCTTTAT CAAAAGCCCG
481 GTGGAAATCC ATTAGCAACA CTATTGGTAC AAGGATTAGC AACCTATGAG GAAGACAAAT
541 GGGCCAAACA TAGAAAAATC ATCAATCCCG CTTTCCATCT AGAGAAGCTA AAGCATATGC
601 TTCCAGCTTT TCGCTTGAGC TGTAGTGAGA TGCTGAGCAA ATGGGAAGAC ATTGTTTCAG
661 CTGAAGGCTC ACATGAGATA GATGTATGGC CTAACCTTGA GCAATTGAGT TGCGATGTGA
721 TCTCTCGGAC AGCTTTTGGC AATAGTTATG AAGAAGGTAG AAAGATTTTT GAACTTCAAA
781 AGGAACAAAC TCAGCATCTT GTGGAAGCTT TCCGCTCTGT TTATATCCCA GGAAGGAGAT
841 TTTTGCCAAC AAAGAGGAAT AGAAGAATGA AGGAAATAAA AAAGGAGGTT CGAGCGTCAA
901 TTAAAGGTAT TATTGATAAA AGATTGAAGG CAATGAAAGC AGGGGACACC AATAATGAGG
961 ATCTATTGGG ATATTGCTGG AATCAAATTT TAAAGAAATT GAACAGCGCG GAAACAAGGA
1021 TTTTGGAATG AGCATTGAAG ATGTCATTGA AGAATGCAAG TTATTCTATT TTGCTGGCCA
1081 AGAAACTACA TCAGTGTTGC TCCTATGGTC TCTAGTGTCG TTGAGCAGGT ATCAAGATTG
1141 GCAGACACGG GCCAGAGAAG AAGTCTTGCA TGTCTTTGGG AGTCGGAAAC CAGATTTTGA
1201 TGAATTAAAT CATCTAAAAG TTGTGACAAT GATCATGTAC GAGTCTTTAA GGCTATATCC
1261 CTCACTAATA ACACTTACCC GCCGGTGTAA TGAAGACATT GTATTAGGAG AACTATCTCT
1321 ACCAGCTGGT GTCCTAGTCT CTTTGCCAAT GATTTTGTTG CATCATGATG AAGAGATATG
1381 GGGTGAAGAT GCAAAGGAGT TCAAACCAGA GAGATTTAGA GAAGGATTGT CAAGTGCAAC
1441 AAAGGGTCAA CTTACATATT TTCCATTTGG CTGGGGTCCT AGAATATGTA TTGGACAAAA
1501 TTTTGCCATG TTAGAAGCAA AGATGGCTCT GTCTATGATC CTGCAACGCT TCTCTTTTGA
1561 ACTGTCTCCG TCTTATGCAC ATGCCCCTCA GTCCATATTA ACCGTTCAGC CTCAATATGG
1621 TGCTCCACTT ATTTTCCACA AGCTATAATT TGGTACTTGT GAAAGGTGTC TTGTACAATA
1681 TGTTAGTAGA GTTTATTCAG ACTTAGATAC ATGCTTC 



SEQ. ID. NO. 286

1 METVEMIVKV SCAAIVITLL VCLWRVLNWV WFRPKKLEKL LRKQVLYGDM KEFSGMIKEA
61 YSKPMSLSDD VAPRMMPFFL ETIKKYGKRS FIWFGPRPLV LIMDPELIKE VLSKIYLYQK
121 PGGNPLATLL VQGLATYEED KWAKHRKIIN PAFHLEKLKH MLPAFRLSCS EMLSKWEDIV
181 SAEGSHEIDV WPNLEQLSCD VISRTAFGNS YEEGRKIFEL QKEQTQHLVE AFRSVYIPGR
241 RFLPTKRNRR MKEIKKEVRA SIKGIIDKRL KAMKAGDTNN EDLLGYCWNQ ILKKLNSAET 
301 RILE 
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NAME D216-AG8 
ORGANISM NICOTIANA TABACUM

SEQ. ID. NO. 287 

1 CCAAAATGCA GTTCTTCAAC TTCATTTCCT TTGTCTTTTT TGTGTCTTTC CTCTTTTTAT 
61 TAAGGAAATG GAAGAACTCC AATAGC CAAA CCAAAAGATT GCCTCCAGGT CCAT GGAAAT 
121 TACCT GTACT T GGAAGC AT G TTTCATTTGC TAGGTGGACC T CCACAT CAT GTCCTTGGAG 
181 AT TTAGC CAA AAAATAT GGT CCACTTATGC ACCTTCAACT AGGT GAAGTT TCTGTAGTTT 
241 CTGTTACTTC TCCTGAGATG GCAAAAGAAG TACTAAAAAC TCATGACCTC GCTTTTGCAT 
301 CTAGGCCGTT ACTTTTGGCA GCCAAAATTG TCTGCTATAA TGGGACAGAC ATTGTCTTTT 
361 CCCCCTATGG CGATTATTGG AGACAAACGC GTAAAATTTG TCTCTTGGAA TTGCTCAGTG 
421 CCAAAAATGT TAGGTCATTC AGCTCAGTCA GACGAGATGA AGTTTTCCAT AT GATT GAAT 
481 TTTTTTCGAT CATCTTCTGG TAAGCCAGTT AATGTATCAA AAAGGATTTC TCTATTCACA 
541 AC CT CT AT GA CAT GT AGAT C AGCCTTTGGA CAAGAATACA AGGAGCAAGA CGAATT CGCA 
601 CAACTAGTAA AAAAAGT GT C AAGCTTAATG GAAGGGTTTG ATGTTGCTGA TATATTCCCT 
661 TCATTGAAGT TTCTTCATGT GCTCAGTGGA AT GAAGGCTA AAGTTATGGA TGCACACCAT 
721 GAGTTAGATG CCATTCTTGA AAAAAT TAT C AAT GAGCACA AGAAAATTGC AACT GGAAAG 
781 AATAATAATG AATTAGGAGG TGAAGGATTA ATTGACGTAC TGCTAAGACT TATGAAAGAG
841 GGAGGCCTTC AATTCCCGAT CACCAACGAC AACATCAAAG CTATTATTTT TGACATGTTT 
901 GGT GCGGGAA CGGAAACTTC AT C AAC C AC A ATTGACTGGG C CAT GGT C GA AAT GAT AAAG 
961 AATCCAAGTG TATTCGCTAA AGCTCAAGCA GAGGTAAGAG AAGCCTTCAG AGAGAAAGAA 
1021 ACTTTTGATG AAAATGATGT CGAGGAGTT G AAATACTTAA AATT GGTTAT CAAAGAAACT 
1081 TTCAGACTCC ATCCTCCATT TCCCCTTTTG CTCCCAAGAG AAT CTAGAGA AGAAACAGAT 
1141 ATAAACGGCT ACACTATTCC TTTTAAAACA AAACTTATGG TTAACGTTCG GGCTATTGGA 
1201 AGAGATCCAA AATATTGGGA TGACGTGGAA AGTTTTAAGC CAGAGAGATT TGAGCACAAC
1261 TCTATGGATT TTATTGGTAA TAATTTTGAA TATCTTCCCT TTGGTAGTGG AAGGAGAATG
1321 TGCCCTGGGA TATCATTTGG TTTGGCTAAT GTTTATTTGC CACTAGCTCA ATTGTTATAT
1381 CATTTTGATT GGAAACTCCC TACTGGAATC AATTCAAGTG ACTTGGACAT GACTGAGTCG
1441 TCAGGAGTAA CTTGTGCTAG AAAGAGT GAT TTATACTT GA CTGCTACTCC ATATCAACTT 
1501 T CT CAAGAGT GATGCAATGA TATCAACCTT TTGAATTTCG GTCAACCCCA CCAATAGTG 

SEQ. ID. NO. 288 

1 MQFFNFISFV FFVSFLFLLR KWKNSNSQTK RLPPGPWKLP VLGSMFHLLG GPPHHVLGDL
61 AKKYGPLMHL QLGEVSVVSV TSPEMAKEVL KTHDLAFASR PLLLAAKIVC YNGTDIVFSP
121 YGDYWRQTRK ICLLELLSAK NVRSFSSVRR DEVFHMIEFF SIIFW
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NAME D243-AB3 
ORGANISM NICOTIANA TABACUM 
SEQ. ID. NO. 289

1 CCCCACCAAA AAATCATTTC TCTCGTCTAA AATGGAT CTT CTCTTACTAG AGAAGACCTT 
61 AATTGGTCTT TTCTTTGCCA TTTTAATCGC TTTAATTGTC TCTAAACTTC GTTCAAAGCG 
121 TTTTAAGCTT CCTCCAGGAC CAATTCCAGT ACCAGTTTTT GGTAATTGGC TTCAAGTTGG 
181 T GAT GATTTA AACCACAGAA AT CTTACT GA TTATGCCAAG AAATTTGGAG ATCTTTTCTT 
241 GTTAAGAATG GGTCAACGTA ACTTAGTTGT TGTGTCATCT CCT GAAT TAG CTAAAGAAGT 
301 TTTACACACA CAAGGTGTTG AATTTGGTTC AAGAACAAGA AATGTTGTGT TTGATATTTT 
361 TACTGGAAAA GGTCAAGATA TGGTTTTTAC TGTATATGGT GAACATT GGA GAAAAATGAG 
421 G AGAAT TAT G ACT GTACCAT TTTTTACTAA TAAAGTTGTG CAACAGTATA GAGGGGGGTG 
481 GGAGTTTGAG GTGGCAAGTG TAATT GAGGA TGTGAAAAAA AAT CCT GAAT CTGCTACTAA 
541 TGGGATCGTA TTAAGGAGGA GATTACAATT AATGATGTAT AAT AAT AT GT TTAGGATTAT 
601 GTTT GATAGG AGATTT GAGA GTGAAGATGA TCCTTTGTTT GTTAAGCTTA AGGCTTTGAA 
661 TGGTGAAAGG AGTAGATTGG CTCAAAGTTT T GAGT AT AAT TATGGTGATT TTATTCCAAT 
721 TTTGAGGCCT TTTTTTGAGA GGTTATTTGA AGATCTGTAA AGAAGTTAAG GAGAAGAGGC 
781 TGCAGCTTTT CAAAGATTAC TTTGTTGATG AAAGAAAGAA GCTTTCGAAT AC CAAGAGCT 
841 CGGACAGCAA TGCCCTAAAA TGTGCGATTG ATCACATTCT TGAGGCTCAA CAGAAGGGAG 
901 AGAT C AAT GA GGACAACGTT CTTTACATTG TTGAAAACAT CAATGTTGCT GCAATTGAAA 
961 CAAC AT TAT G GTCAATTGAG TGGGGTATCG CCGAGCTAGT CAACCACCCT CACAT CCAAA 
1021 AGAAACT GCG CGACGAGATT GACACAGTT C TTGGACCAGG AGTGCAAGTG ACTGAACCAG 
1081 ACACCCACAA GCTTCCATAC CTTCAGGCTG TGATCAAGGA GGCACTTCGT CTCCGTATGG 
1141 CAATTCCTCT ATTAGTCCCA CACAT GAACC TTCACGACGC AAAGCTT GGC GGGTTTGATA 
1201 TTCCAGCAGA GAGCAAAATC TTGGTTAACG CTTGGTGGTT AGCTAACAAC CCGGCTCATT 
1261 GGAAGAAACC CGAAGAGTTC AGACCCGAGA GGTTCTTTGA AGAGGAGAAG CATGTTGAGG
1321 CCAAT GGCAA TGACTTCAGA TATCTTCCGT TTGGCGTTGG TAGGAGGAGC TGCCCTGGAA 
1381 TTATACTTGC ATTGCCAACT CTTGGCATCA CTTTGGGACG TTTGGTTCAG AACTTTGAGC 
1441 TGTTGCCTCC TCCAGGCCAG TCGAAGCTCG ACACCACAGA GAAAGGT GGA CAGTTCAGTC 
1501 TCCACATTTT GAAGCATTCC ACCATTGTGT TGAAACCAAG GTCTTTCTGA ACTTTGTGAT 
1561 CTTATTAATT AAGGGGTTCT GAAGAAATTT GATAGTGTTG G 

SEQ. ID. NO. 290 

1 MDLLLLEKTL IGLFFAILIA LIVSKLRSKR FKLPPGPIPV PVFGNWLQVG DDLNHRNLTD
61 YAKKFGDLFL LRMGQRNLVV VSSPELAKEV LHTQGVEFGS RTRNVVFDIF TGKGQDMVFT
121 VYGEHWRKMR RIMTVPFFTN KVVQQYRGGW EFEVASVIED VKKNPESATN GIVLRRRLQL
181 MMYNNMFRIM FDRRFESEDD PLFVKLKALN GERSRLAQSF EYNYGDFIPI LRPFFERLFE
241 DL
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NAME D250-AC11 
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 291 

1 ATAATGCTCT TTCTACTCTT TGTAGCCCTT CCTTTCATTC TTATTTTTCT TCTTCCTAAA 
61 TTCAAAAATG GTGGAAATAA CAGATTGCCA CCAGGTCCTA TAGGTTTACC ATTCATTGGA 
121 AATTTGCATC AAT AT GAT AG TATAACTCCT CATATCTATT TTTGGAAACT TTCCAAAAAA 
181 TAT G GCAAAA TCTTCTCATT AAAACTTGCT TCTACTAATG TGGTAGTAGT TTCTTCAGCA 
241 AAATTAGCAA AAGAAGTATT GAAAAAACAA GATTTAATAT TTT GTAGTAG ACCATCTATT 
301 CTTGGCCAAC AAAAACTGTC T T ATT AT G GT C GT GAT AT T G CTTTTGCACC TTATAAT GAT 
361 TATT GGAGAG AAATGAGAAA AATTTGTGTT CTTCATCTTT TTAGTTTAAA AAAAGTTCAA 
421 TTATTTAGTC CAATTCGTGA AGATGAAGTT TTTAGAAT GA TTAAGAAAAT ATCAAAACAA 
481 GCTTCTACTT CACAAATTAT TAATTT GAGT AATTTAAT GA TTT CATTAAC AAGTACAATT 
541 ATTT GTAGAG TTGCTTTTGG TGTTAGGTTT GAAGAAGAAG CACAT GCAAG GAAGAGATTT 
601 GATTTTCTTT TGGCCGAGGC AC AAGAAAT G ATGGCTAGTT TCTTTGTATC TGATTTTTTT 
661 CCCTTTTTAA GTTAGAT T GA CAAATTAAGT GGATTGACAT ATAGACTT GA GAGGAATTTC 
721 AAGGATTTGG ATAATTTTTA T GAAGAACT C ATTGAGCAAC AT CAAAAT CC TAATAAGCCA 
781 AAATATATGG AAGGAGATAT TGTTGATCTT TTGCTACAAT T G AAGAAAG A GAAATTAACA 
841 CCACTTGATC T CAC TAT G GA AGATATAAAA GGAATTCTCA TGAATGTGTT AGTTGCAGGA 
901 TCAGACACTA GTGCAGCTGC TACTGTTTGG GCAAT G AC AG CCT T GATAAA GAAT CCTAAA 
961 GCCATGGAAA AAGTTCAATT AGAAATCAGA AAATCAGTTG GGAAGAAAGG CATT GTAAAT 
1021 GAAGAAGATG T C CAAAAC AT CCCTTATTTT AAAGCAGT GA TAAAGGAAAT ATTTAGATTG 
1081 TAT CCACCAG CTCCACTTTT AGTTCCAAGA GAATCAATGG AAAAAACCAT ATTAGAAGGT 
1141 TATGAAATTC GGC CAAGAAC CATAGTTCAT GTTAACGCTT GGGCTATAGC AAGGGAT CCT 
1201 GAAATAT GGG AAAAT C C AG A TGAATTTATA CCTGAGAGAT TTTT GAATAG CAGTATCGAT 
1261 TACAAGGGTC AAGATTTTGA GTTACTTCCA TTTGGTGCAG GCAGAAGAGG TTGCCCAGGT 
1321 ATTGCACTTG GGGTTGCATC CAT GGAACTT GCTTTGTCAA ATCTTCTTTA TGCATTTGAT 
1381 TGGGAGTTGC CTTAT GGAGT GAAAAAAGAA G ACAT C G AC A CAAACGTTAG GCCTGGAATT 
1441 GCCAT GCACA AGAAAAACGA ACTTTGCCTT GTCCCAAAAA AATTATTTAT AAATTATATT 
1501 GGGACGTGGA TCTCATGCTA GTTCTGTGCG GT CAGCTAAG CTTA 

SEQ. ID. NO. 292 

1 MLFLLFVALP FILIFLLPKF KNGGNNRLPP GPIGLPFIGN LHQYDSITPH IYFWKLSKKY 
61 GKIFSLKLAS TNVVVVSSAK LAKEVLKKQD LIFCSRPSIL GQQKLSYYGR DIAFAPYNDY
121 WREMRKICVL HLFSLKKVQL FSPIREDEVF RMIKKISKQA STSQIINLSN LMISLTSTII
181 CRVAFGVRFE EEAHARKRFD FLLAEAQEMM ASFFVSDFFP FLS.IDKLSG LTYRLERNFK
241 DLDNFYEELI EQHQNPNKPK YMEGDIVDLL LQLKKEKLTP LDLTMEDIKG ILMNVLVAGS 
301 DTSAAATVWA MTALIKNPKA MEKVQLEIRK SVGKKGIVNE EDVQNIPYFK AVIKEIFRLY 
361 PPAPLLVPRE SMEKTILEGY EIRPRTIVHV NAWAIARDPE IWENPDEFIP ERFLNSSIDY 
421 KGQDFELLPF GAGRRGCPGI ALGVASMELA LSNLLYAFDW ELPYGVKKED IDTNVRPGIA 
481 MHKKNELCLV PKKLFINYIG TWISC 
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NAME D205-AH4 

ORGANISM NICOTIANA TABACUM

SEQ. ID. NO. 293 

1 GTGAGGTTTG AATCCTCTGC CTCAATGAAA CTCACCAAAT TGGTTTTCTA ATTTCCATCT 
61 AAAATATTGT CCAAAGCTAA AGATTCTTTC TCCTTAAATA GTCAACTTTA GTGGTTCCTC 
121 TTCATTTCAT AGCTCAATCT TTCTTATTTT GATTCAACCA TGGAGAACCA ATACT CCTAC 
181 TCATTCTCTT CCTACTTCTA CTTAGCTATA GTACTGTTTC TTCTTCCAAT TTTGGTCAAA
241 TATTTCTTCC AT C GGAGAAG AAATTTACCT C C AAGT C CAT TTTCTCTTCC AATAATT GGT 
301 CACCTTTACC TTCTCAAGAA AACTCTCCAT CTCACTCTAA CATCCTTATC AGCTAAATAT 
361 GGTCCTGTTT TATACCTCAA ATTGGGCTCT ATGCCTGTGA TTGTTGTGTC CTCACCATCT 
421 GCTGTTGAAG AAT GTTTAAC CAAGAATGAT AT CATATT CG CAAATAGGCC CAAGACCGTG 
481 GCTGGTGACA AGTTTACCTA CAATTATACT GTTTATGTTT GGGCACCCTA TGGCCAACTT 
541 TGGAGAATTC TTCGCCGATT AACTGTCGTT GAACTCTTCT CTTCACATAG C C T AC AGAAA 
601 ACTTCTATCC TTAGAGATCA AGAAGTT GCA ATATTTATCC GTTCGTTATA CAAATTCTCA 
661 AAGGATAGTA GCAAAAAAGT CGATTTGACC AACTGGTCTT TTACTTTGGT TTTCAATCTT 
721 AT GACCAAAA TTATTGCTGG GAGACATATT GTGAAGGAGG AAGATGCTGG CAAGGAAAAG 
781 GGCATTGAAA TTATTGAAAA ACTTAGAGGG ACTTTCTTAG TAACTACATC ATTCTTGAAT
841 ATGTGTGATT TCTTGCCAGT ATTCAGGTGG GTTGGTTACA AAGGGCTGGA GAAGAAGATG 
901 GCCTCAATTC ACAATAGAAG AAAT GAATT C TTGAACAGCT TGCTTGATGA ATTTCGACAC 
961 AAGAAAAGTA GTGCTTCACA AT CTAACACA ACTGTTGGAA ACAT GGAGAA GAAAACCACA 
1021 CTGATTGAAA AGCTCTTGTC TCTTCAAGAA TCAGAGCCTG AATT CTACAC T GAT GAT AT C 
1081 AT C AAAAGT A TTATGCTGGT AGTTTTTGTT GCAGGAACAG AGAC CTCAT C AACAACCATC 
1141 CAAT GGGTAA TGAGGCTTCT TGTAGCTCAC CCT GAGGCAT TGTATAAGCT AC GAGCT GAC 
1201 ATT GACAGTA AAGTTGGGAA TAAGCGCTTG CTGAATGAAT CAGACCT CAA CAAGCTTCCG 
1261 TATTTGCATT GTGTTGTTAA TGAGACAATG AGATTATACA CTCCGATACC ACTTTTATTG 
1321 CCTCATTATT CAACTAAAGA TTGTATTGTG GAAGGATATG AT GT AC CAAA AC AT AC AAT G 
1381 TTGTTTGTCA ACGCTTGGGC CATTCACAGG GATCCCAAGG TAT GGGAGGA GCCTGACAAG 
1441 TTCAAGCCAG AGAGATTTGA GGCAACAGAA GGGGAAACAG AAAGGTT CAA TTACAAGCTT 
1501 GTACCATTTG GAATGGGGAG AAGAGCGTGC CCTGGAGCTG ATATGGGGTT GCGAGCAGTT 
1561 TCTTTGGCAT TAGGT GCACT T ATT CAAT GC TTTGACTGGC AAATTGAGGA AGCGGAAAGC 
1621 TTGGAGGAAA GCTATAATTC TAGAATGACT AT GCAGAACA AGCCTTTGAA GGTTGTCTGC 
1681 ACTCCACGCG AAGATCTTGG CCAGCTTCTA TCCCAACTCT AAGGCAATTT AT CAAT G C C A 
1741 AACGTAATCT T CAT CTAC CA CTATG 

SEQ. ID. NO. 294 

1 MENQYSYSFS SYFYLAIVLF LLPILVKYFF HRRRNLPPSP FSLPIIGHLY LLKKTLHLTL
61 TSLSAKYGPV LYLKLGSMPV IVVSSPSAVE ECLTKNDIIF ANRPKTVAGD KFTYNYTVYV
121 WAPYGQLWRI LRRLTVVELF SSHSLQKTSI LRDQEVAIFI RSLYKFSKDS SKKVDLTNWS
181 FTLVFNLMTK IIAGRHIVKE EDAGKEKGIE IIEKLRGTFL VTTSFLNMCD FLPVFRWVGY
241 KGLEKKMASI HNRRNEFLNS LLDEFRHKKS SASQSNTTVG NMEKKTTLIE KLLSLQESEP
301 EFYTDDIIKS IMLVVFVAGT ETSSTTIQWV MRLLVAHPEA LYKLRADIDS KVGNKRLLNE
361 SDLNKLPYLH CVVNETMRLY TPIPLLLPHY STKDCIVEGY DVPKHTMLFV NAWAIHRDPK
421 VWEEPDKFKP ERFEATEGET ERFNYKLVPF GMGRRACPGA DMGLRAVSLA LGALIQCFDW
481 QIEEAESLEE SYNSRMTMQN KPLKVVCTPR EDLGQLLSQL
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NAME D267-AF10 
ORGANISM NICOTIANA TABACUM 

SEQ. ID. NO. 295

1 CCTTT CCTTCTTCCA AAAATGGAGC TTCAATCTTC TCCTTTCAAT TTAATTTCTT
61 TGTTCCTCTT CTTTTCTTTT CTTTTTATTC TAGTGAAGAA ATGGAATGCC AAAATCCCAA
121 AGTTACCTCC AGGTCCGTGG AGGCTTCCCT TTATTGGAAG CCTCCATCAC TTGAAGGGAA
181 AACTTCCACA CCATAATCTT AGAGATCTAG CGCGAAAATA TGGACCTCTC ATGTACTTAC
241 AACTCGGAGA AATTCCTGTA GTTGTAATAT CTTCGCCACG TGTAGCAAAA GCTGTACTAA
301 AAACTCATGA TCTCGCTTTT GCAACTAGAC CACGATTCAT GTCCTCAGAC ATTGTGTTTT
361 ACAAAAGCAG GGACATCTCT TTTGCCCCAT TTGGTGATTA CTGGAGACAG ATGCGTAAAA
421 TATTGACTCA GGAACTCCTG AGCAACAAGA TGCTCAAGTC ATATAGCTTA ATCCGAAAGG
481 ATGAGCTCTC GAAGCTCCTC TCATCGATTC GTTTGGAAAC AGGTTCTGCA GTGAACATAA
541 ATGAAAAGCT TCTCTGGTTT ACGAGCTGCA TGACCTGTAG ATTAGCCTTT GGAAAAATAT
601 GCAATGATCG GGATGAGTTG ATCATGCTAA TTAGGGAGAT ATTAACATTA TCAGGAGGAT
661 TTGATGTGGG TGATTTGTTC CCTTCCTGGA AATTACTTCA TAATATGAGC AACATGAAAG
721 CTAGGTTGAC GAATGTACAC CACAAGTATG ATTTAGTTAT GGAGAACATC ATCAATGAGC
781 ACCAAGAGAA TCATGCAGCA GGGATAAAGG GTAACAACGA GTTTGGTGGC GAAGATATGA
841 TCGATGCTCT ACTGAGGGCT AAGGAGAATA ATGAGCTTCA ATTTCCTATC GAAAATGACA
901 ACATGAAAGC AGTAATTCTG GACTTGTTTA TTGCTGGAAC TGAAACTTCA TATACTGCAA
961 TTATATGGGC ACTATCAGAA TTGATGAAGC ACCCAAGTGT GATGGCCAAG GCACAAGCTG
1021 AAGTGAGAAA AGTCTTCAAA GAAAATGAAA ATTTCGACGA AAATGATCTT GACAAGTTGC
1081 CATACCTAAA ATCAGTGATT AAAGAAACAC TAAGGATGCA CCCTCCAGTT CCTTTGTTAG
1141 GGCCTAGAGA ATGCAGGGAC CAAACAGAGA TCGATGGCTA CACTGTACCT ATTAAAGCTA
1201 GAGTTATGGT TAATGCTTGG GCGATAGGAA GAGATCCTGA AAGTTGGGAA GATCCTGAAA
1261 GTTTCAAACC GGAGCGATTT GAAAATACTT CTGTTGATCT TACAGGAAAT CACTATCAGT
1321 TCATTCCTTT CGGTTCAGGA AGAAGAATGT GTCCAGGAAT GTCGTTTGGT TTAGTTAACA
1381 CAGGGCATCC TTTAGCCCAG TTGCTCTATT GCTTTGACTG GAAACTCCCT GACAAGGTTA
1441 ATGCAAATGA TTTTCGCACT ACTGAAACAA GTAGAGTTTT TGCAGCAAGC AAAGATGACC
1501 TCTACTTGAT TCCCACAAAT CACAGGGAGC AAGAATAGCT TAATTTAATG GAGTTCTTGG
1561 AAGAATTAAA GAAGAAGGGC TATATAGGTG AGATTTTTTG TATGGTTGCA AGGTTTTTAG
1621 TTCATACAAT AAGACAATAC ATTATATTCC AGTATTGTGT ATCATGTATA ATAAGGTTCC
1681 TTTTGTTTAA AAAA

SEQ. ID. NO. 296

1 MELQSSPFNL ISLFLFFSFL FILVKKWNAK IPKLPPGPWR LPFIGSLHHL KGKLPHHNLR
61 DLARKYGPLM YLQLGEIPVV VISSPRVAKA VLKTHDLAFA TRPRFMSSDI VFYKSRDISF
121 APFGDYWRQM RKILTQELLS NKMLKSYSLI RKDELSKLLS SIRLETGSAV NINEKLLWFT
181 SCMTCRLAFG KICNDRDELI MLIREILTLS GGFDVGDLFP SWKLLHNMSN MKARLTNVHH
241 KYDLVMENII NEHQENHAAG IKGNNEFGGE DMIDALLRAK ENNELQFPIE NDNMKAVILD
301 LFIAGTETSY TAIIWALSEL MKHPSVMAKA QAEVRKVFKE NENFDENDLD KLPYLKSVIK
361 ETLRMHPPVP LLGPRECRDQ TEIDGYTVPI KARVMVNAWA IGRDPESWED PESFKPERFE
421 NTSVDLTGNH YQFIPFGSGR RMCPGMSFGL VNTGHPLAQL LYCFDWKLPD KVNANDFRTT
481 ETSRVFAASK DDLYLIPTNH REQE
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NAME D284-AH5 
ORGANISM NICOTIANA TABACUM
SEQ. ID. NO. 297 

1 CAATCAGTGG ATGCGGGAGT AATATATAAT ATGCAAGTTG TAGAAAGAGA AAAAAAAAAT 
61 CAAGTAGCTA TTCTATACTG GGGCACAAAT AGTGAGTGAA AAT GGAGACT GTTCAAATCA 
121 TAATAACAGC ATCTTGTGCT GC CATAATAA TTACTCTAGT GGTGTGTATT TGGAGAGTAC 
181 TGAATTGGGT TTGGTTCAGA CCAAAGAAGC TGGAAAAACT ATTGAGGAAA CAAGGTCTCA 
241 AAGGCAACTC CTACAAGATT TTGTATGGGG AT AT GAAGGA GCTTTCTGGT AT GATTAAGG 
301 AAGCTAATTC CAAACCCATG AATCTTTCTG ATGATATTGC ACCAAGATTG GTGCCTTTCT 

361 TTCTTGACAC CATCAAGAAA TATGGTAAAA AATCCTTTGT ATGGTTAGGT CCGAAACCAC
421 TGGTTCTTAT CATGGACCCT GAGCTTATAA AGGAAATATT TTCCAAATAC TATCTGTATC 
481 AAAAGCCTCA TGGAAATCCA GTTAC CAAGC TATTAGTACA AGGACTAGTA AGC CTAGAGG 
541 AAGACAAATG GGCCAAACAT AGAAAAATCA TCAATCCAGC TTTCCATCTA GAGAAGCTAA 
601 AGCATAT GCT TCCAGCTTTT TGCTTGAGCT GCACT GAGAT GCTGTGCAAA TGGGAAGATA 
661 TTGTTTCAAT TAAGGGCTCA CATGAGATAG ATGTATGGCC TCACCTTGAA CAATTAAGTA 
721 GCGATGTGAT CTCTCGGACA GCTTTTGGCA GTAACTTTGA AGAAGGTAAA AGGATATTTG
781 AACTTCAGAA GGAACAAGCT CAGTATTTTG TAGAAGCTAT ACGCTCGGTT TATATACCAG
841 GCT GGAGGTT TTTGCCAACA AAGAGGAACA GAAGAATGAA GGAAGTT GAA AAGGATGTTC 
901 GGGCCTCGAT AAGAGGCATT ATT GATAAAA GAGT GAAGGC AATGAAAGCA GGAGAGGCGA 
961 GTAAT GAGGA TCTACTTGGT ATATTGTTGG AAT CTAATTT TACAGAAGCT GAACAGCATA 

1021 GACACAAGGA TTCTGCGATG AGCATT GAAG AAGTCATTCA AGAAT GCAAG TTATTCTATG 
1081 TTGCTGGCCA AGAAACTACA TCAGTGTTGC TTGTGTGGAC TCTAATATTG TTGAGTAGGC 
1141 AT CAAGATT G GCAGAGCCGA GC CAGAGAAG AGGTGTTTCA AGTCTTTGGT AATCAGAAAC 
1201 CAGATTTTGA CGGATTGAAT CGT CTAAAAG TTGTGACAAT GATCTTGTAT GAGT CTTTAA 
1261 GGCTATACTC CCCAGTAGTG TCACTAATCC GGCGGCCTAA TGAGGATGCT ATATTAGGAA
1321 ATGTATCTCT GC CAGAAGGT GTGCTACTCT CATTACCAGT GATCTTATTA CACCACGATG 
1381 AAGAGAT AT G GGGTAAAGAT GCAAAGAAGT TCAATCCAGA AAGAT TTAGA GATGGAGTCT 
1441 CAAGT GCAAC AAAGGGTCAA GTCACTTTTT TTCCATTTAC TTGGGGTCCC AGAAT AT GCA 
1501 TCGGACAAAA TTTTGCCATG T T AGAAG C AA AGACTGCTTT GGCTAT GAT C CTACAACGCT 
1561 TCTCATTCGA ACTGTCTCCA TCTTATGCAC ATGCTCCTCA GT CCATATTA ACTAT GCAAC 
1621 CCCAACATGG TGCTCCACTA ATTCTGCACA AAATATAGTT TGTTACTTTA AGCAGTGTCT 
1681 T GTTATAT GT CAGAGAGTCC AAAATGTTTA ATTAAGGCTT GTAGAACTGC CAAAT GGAAC 
1741 TTCATTTGCA TTCGTGGGTT GTAGATT GTT GTAATT GGAC AAGTATACTG TTTATTTTAG 
1801 AGTTTTAAGA AAAAAAAAA

SEQ. ID. NO. 298 

1 METVQIIITA SCAAIIITLV VCIWRVLNWV WFRPKKLEKL LRKQGLKGNS YKILYGDMKE
61 LSGMIKEANS KPMNLSDDIA PRLVPFFLDT IKKYGKKSFV WLGPKPLVLI MDPELIKEIF
121 SKYYLYQKPH GNPVTKLLVQ GLVSLEEDKW AKHRKIINPA FHLEKLKHML PAFCLSCTEM
181 LCKWEDIVSI KGSHEIDVWP HLEQLSSDVI SRTAFGSNFE EGKRIFELQK EQAQYFVEAI
241 RSVYIPGWRF LPTKRNRRMK EVEKDVRASI RGIIDKRVKA MKAGEASNED LLGILLESNF
301 TEAEQHRHKD SAMSIEEVIQ ECKLFYVAGQ ETTSVLLVWT LILLSRHQDW QSRAREEVFQ
361 VFGNQKPDFD GLNRLKVVTM ILYESLRLYS PVVSLIRRPN EDAILGNVSL PEGVLLSLPV
421 ILLHHDEEIW GKDAKKFNPE RFRDGVSSAT KGQVTFFPFT WGPRICIGQN FAMLEAKTAL
481 AMILQRFSFE LSPSYAHAPQ SILTMQPQHG APLILHKI
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Figure 150: Amino Acid Identity of Group Members 



Group 1

AQLAINLVTSMLGHLLHHFTWAPAPGVNPEDIDLEESPGTVTYMKNPIQAIPTPRLPAHLYGRVPVDM SEQ ID No.: 2 D58-BG7
(98.5)
AQLAINLVTSMLGHLLHHFTWAPPPGVNPENIDLEESPGTVTYMKNPIQAIPTPRLPAHLYGRVPVDM SEQ ID No.: 4 D58-AB1

Group 2

QLAINLVTSMLGHLFIILHGLRPRGLTRRILTWRRALEQ SEQ ID No.: 8 D58-BE4

Group 3

EGLAVRMVALSLGCIIQCFDWQRIGEELVDMTEGTGLTLPKAQPLVAKCSPRPKMANLLSQI SEQ ID No.: 10 D56-AH7
(93.5)
EGLAIRMVALSLGCIIQCFDWQRLGEGLVDKTEGTGLTLPKAQPLVAKCSPRPIMANLLSQI SEQ ID No.: 12 D13a-5

Group 4

IGFATLVTHLTFGRLLQGFDFSKPSNTPIDMTEGVGVTLPKVNQVEVLITPRLPSKLYLF SEQ ID No.: 14 D56-AG10
(93.3)
INFATLVTHLTFGRLLQGFDFSTPSNTPIDMTEGVGVTLPKVNQVEVLISPRLPSKLYVF SEQ ID No.: 18 D34-62



Group 5

IILALPILGITLGRLVQNFELLPPPGQSKLDTTEKGGQFSLHILKHSTIVLKPRSF SEQ ID No.: 20 D56-AA7
(98.2)
IILALPILGITLGRLVQNFELLPPPGQSKLDTTEKGGQFSLHILKHSTIVMKPRSF SEQ ID No.: 144 D185-BD3
(96.4)
IILALPILGITLGRLVQNFELLPPPGQSKLDTTEKGGQFSLHILKHSTIVLKPRSC SEQ ID No.: 22 D56-AE1

Group 6

IALGVASMELALSNLLYAFDWELPFGMKKEDIDTNARPGITMHKKNELYLIPKNYL SEQ ID No.: 24 D35-B37
(92.8)
IALGVASMELALSNLLYAFDWELPYGVKKENIDTNVRPGITMHKKNELCLIPRNYL SEQ ID No.: 26 D177-BA7
(96.4)
IALGVASMELALSNLLYAFDWELPYGVKKEDIDTNVRPGIAMHKKNELCLVPKNYL SEQ ID No.: 28 D56A-AB6
(94.6)
IALGVASMELALSNLLYAFDWELPYGVKKEDIDTNVRPGIAMHKKNELCLVPKKLFINYIGTWISC SEQ ID No.: 30 D144-AE2

Group 7

ISFGLANAYLPIAQLLYHFDWELPTGIKPSDLDLTELVGVTAARKSDLYLVATPYQPPQH SEQ ID No.: 32 D56-AG11
(93.3)
ISFGLANAYLPLAQLLYHFDWKLPAGIEPSDLDLTELVGVTAARKSDLYLVATPYQPPQK SEQ ID No.: 34 D179-AA1

Group 8

MLFGLANVGQPLAQLLYHFDWKLPNGQSHENFDMTESPGISATRKDDLVLIATPYDSY SEQ ID No.: 36 D56-AC7
(91.2)
MLFGLANVGQPLAQLLYHFDWKLPNGQTHQNFDMTESPGISATRKDDLILIATPAHS SEQ ID No.: 38 D144-AD1
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Group 9 

LLFGLVNVGHPLAQLLYHFDWKTLPGISSDSFDMTETDGVTAGRKDDLCLIATPFGLN SEQ ID No.: 40 D144-AB5

Group 10

MSFGLVNTGHPLAQLLYrFDWKFPHKVNAADFHTTETSRVFAASKDDLYLIPTNHMEQE SEQ ID No.: 42 D181-AB5
(89.8)
MSFGLVNTGHPLAQLLYCFDWKLPDKVNANDFRTTETSRVFAASKDDLYLIPTNHREQE SEQ ID No.: 44 D73-Ac9

Group 11

MQFGLALVTLPLAHLLHNFDWKLPEGINARDLDMTEANGISARREKDLYLIATPYVSPLD SEQ ID No.: 46 D56-AC12

Group 12

MTYALQVEHLTMAHLIQGFNYRTPTDEPLDMKEGAGITIRKVNPVKVIITPRLAPELY SEQ ID No.: 48 D58-AB9
(89.6)
MTYALQVEHLTMAHLIQGFNYKTPNDEALDMKEGAGITIRKVNPVELIIAPRLAPELY SEQ ID No.: 50 D56-AG9
(98.2)
MTYALQVEHLTMAHLIQGFNYKTPNDEALDMKEGAGITIRKVNPVELIITPRLAPELY SEQ ID No.: 52 D56-AG6
(94.8)
MTYALQVEHLTMAHLIQGFNYRTPNDEPLDMKEGAGITIRKVNPVELIIAPRLAPELY SEQ ID No.: 54 D35-BG11
(98.3)
MTYALQVEHLTMAHLIQGFNYRTPNDEPLDMKEGAGITIRKVNPVELIIAP-LAPELY SEQ ID No.: 56 D35-42
(98.3)
MTYALQVEHLTMMLIQGFNYRTPNDEPLDMKEGAGITIRKVNPAELIIAPRLAPELY SEQ ID No.: 58 D35-BA3
(84.5)
MTYALQVEHLTIAHLIQGFNYKTPNDEPLDMKEGAGLTIRKVNPVEVTTTARLAPELY SEQ ID No.: 60 D34-57
(98.3)
MTYALQVEHLTIAHLIQGFNYKTPNDEPLDMKEGAGLTIRKVNPVEVTITARLAPELY SEQ ID No.: 62 D34-52



Group 13

YSLGLKVIRVTLANMLHGFNWKLPEGMKPEDISVEEHYGLTTHPKFPVPVILESRLSSDLYSPIT SEQ ID No.: 66 D56-AD10

Group 14

YSLGIRIIRATI^LLHGFNWRLPNGMSPEDISMEEIYGLITHPIWALDVmEPRLPNHLYK SEQ ID No.: 68 D56-AA11

Group 15

INFSIPLVELALANLLFHYNWSLPEGMLAKDVDMEEALGITMHKKSPLCLVASHYTC SEQ ID No.: 70 D177-BD5
(94.7)
INFSIPLVEIALANLLFHYNWSLPEGMLPKDVDMEEALGITMHKKSPLCLVASHYNLL SEQ ID No.: 84 D177-BD7

Group 16

MQLGLYALEMAVAHLLLCFTWELPDGMKPSELKMDDIFGLTAPRANRLVAVPSPRLLCPLY SEQ ID No.: 74 D58-BC5
(96.7)
MQLGLYALEMAVAHLLHCFTWELPDGMKPSELKMDDIFGLTAPRANRLVAVPTPRLLCPLY SEQ ID No.: 76 D58-AD12
(98.4)
MQLGLYALEMAVAHLLHCFTWELPDGMKPSELKMDDIFGLTAPKANRLVAVPTPRLLCPLY SEQ ID No.: 72 D56A-AG10

Group 17

HLWSASIVRVSYLTCIYRFQVYAGSVFRVA

SEQ ID No.: 78 D56-AC11
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MLWSAS IVRVS YLTCI YRFQVYAGSVSRVA 



(96.7)

SEQ ID No.: 88 D56-AD6F

Group 18

LNFAMLEAKMAIALILQHYAFELSPSYAHAPHTIITLQPQHGAPLILRKL SEQ ID No.: 90 D73A-AD6



Group 19

QNFAILEAKMAIAMILQRFSFELSPSYTHSPYXWTLKPKYGAPLIMHRL SEQ ID No.: 96 D70A-ABB
(72.0)
QNFAMLEAKMALSMILQRFSFELSPSYAHAPQSILTVQPQYGAPLIFHKL SEQ ID No.: 100 D70A-AB8
(82.0)
INFAMTEAKMAMAMILQRFSFELSPSYTHAPQSVITMQPQYGAPLILHKL SEQ ID No.: 102 D70A-BH2
(98.0)
IMFAMAEAKMAMAMILQRFSFELSPSYTHAPQSVITMQPQYGAPLILHKL SEQ ID No.: 104 D70A-AA4
(70.0)
QMFAMMEAKMAVAMILHKFSFELSPSYTFAPFAIVTIHPQYGAPLLMRRL SEQ ID No.: 108 D70A-BA9
(98.0)
QNFAMMEAKMAVAMILQKFSFELSPSYTHAPFAIVTIHPQYGAPLLMRRL SEQ ID No.: 106 D70A-BA1

Group 20

QNFAMLEAKMAMAMILKTYAFELSPSYAHAPHPLLLQPQYGAQLILYKL SEQ ID No.: 110 D70A-BD4



Group 21

YSMGLKAIQASLANLLHGFNWSLPDNMTPEDLNMDEIFGLSTPKKFPLATVIEPRLSFKLYSV SEQ ID No.: 112 D181-AC5
(96.8)
YSLGLKEIQASLANLLHGFNWSLPDNMTPEDLNMDEIFGLSTPKKFPLATVIEPRLSPKLYSV SEQ ID No.: 114 D144-AH1
(96.8)
HSLGLKVIQASLANLLHGFNWSLPDNKTPEDLNMDEIFGLSTPKKFPLATVIEPRLSPKLYSV SEQ ID No.: 116 D34-65

Group 22

LCFPCLISSYILAXNVNLYHNFLQISPSISY SEQ ID No.: 118 D35-BG2

Group 23

SGLAQCVVGLAIATLVQCFEWKRVSEEVVDLTE SEQ ID No.: 120 D73A-AH7

Group 24

LGLATVHVNLMLARMIQEFEWSAYPENRKVDLLRNWNLLW SEQ ID No.: 136 D185-BG2
(77.5)
LGLATVHVNLMLARMIQEFEWSAYPENRKVDFTEKLEFTWMKNPLRAKVKPRMQW SEQ ID No.: 122 D58-AA1
(98.2)
LGLATVHVNLMLARTIQEFEWSAYPENRKVDFTEXLEFTVVMKNPLRAKTVKPRMQW SEQ ID No.: 134 D185-BC1
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Group 25 

YALAMLHLEYFVANLVWHFRWEAVEGDDVDLSEKLEFTWMKNPLRARICPRVNSI SEQ ID No.: 124 D73A-AE10

Group 26

QQVGLLRTTIFIASLLSEYKLKPRSHQKQVELTDLNPASWLHSIKGELLVDAIPRKKAAF SEQ ID No.: 126 D56A-AC12

Group 27

ITFAKFVNEIAIARLMFHFDFSLPKGVKHEDLDVEEAAGITVRRKFPLLAVATPCS SEQ ID No.: 128 D177-BF7
(98.2)
ITFAKFVNELALARLMFHFDFSLPKGVKHADLDVEEAAGITVRRKFPLLAVATPCS SEQ ID No.: 140 D185-BD2

Group 28

QRYAINHLMLFIALFTALIDFKRHKTDGCDDIAYIPTIAPKDDCKVFLSQRCTRFPSFS SEQ ID No.: 130 D73A-AG3

Group 29

MSFGLANLYLPLAQLLYHFDWKLPTGIKPRDLDLTELSGITIARKGDLYLNATPYQPSRE SEQ ID No.: 132 D70A-AA12
(80.0)
ISFGLANVYLPLAQLLYHFDWKLPTGINSSDLDMTESSGVTCARKSDLYLTATPYQLSQE SEQ ID No.: 86 176-BF2

Group 30

QNFAMLEAKTTL^ILQRFSFELSPSYAHAPQSIITCNPSMVLHLFCIKYSLLLVSSVSFYVKHE^ SEQ ID No.: 146 D176-BC3

Group 31

ADMGLRAVSLALGALIQCFDWQIEEAESLEESYNSRMTMQNKPLKVVCTPREDLGQLLSQL SEQ ID No.: 148 D176-BB3

Group 32

MNYSLQVEHLSIAHMIQGFSFATTTNEPLDMKQGVGLTLPKKTDVEVLITPRLPPTLYQY SEQ ID No.: 6 D186-AH4



The percentage identity between the most closely related pairs is noted in parentheses, in the form (0.0). Each group member had at
least 70% identity to another member of its group. Group 19 contained the lowest percentage
identity, at 70.0%.
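
The percentages quoted between adjacent group members can be illustrated with a minimal sketch, assuming identity is taken as the share of aligned, non-gap positions at which two sequences carry the same residue. The source does not state how gaps were handled or which program produced its values, so the function name `percent_identity`, the gap rule, and the toy peptides below are illustrative assumptions, not the method used for this figure.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of aligned, non-gap positions at which two sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            # Assumption: columns containing a gap are skipped rather than
            # counted as mismatches.
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 10-residue peptides differing at one position.
print(round(percent_identity("MKTAYIAKQR", "MKTAYLAKQR"), 1))  # 90.0
```

Counting gap columns as mismatches instead would lower the value for pairs of unequal length, which is one reason a recomputed figure may differ slightly from a published one.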
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, FIGURE 151: COMPARISON OF SEQUENCE GROUPS 

ALIGNMENT OF GROUP 1

D58-BG7   GCACAACTTGCTATCAACTTGGTCACATCTATGTTGGGTCATTTGTTGCATCATTTTACA SEQ ID No 1
D58-AB1   GCACAACTTGCTATCAACTTGGTCACATCTATGTTGGGTCATTTGTTGCATCATTTTACG SEQ ID No 3
D58-BE4   GCACAACTTGCTATCAACTTGGTCACATCTATGTTGGGTCATTTGTT-CATCATTTTACA SEQ ID No 7
          *********************************************** ***********

D58-BG7   TGGGCTCCGGCCCCGGGGGTTAACCCGGAGGATATTGACTTGGAGGAGAGCCCTGGAACA
D58-AB1   TGGGCTCCGCCCCCGGGGGTTAACCCGGAGAATATTGACTTGGAGGAGAGCCCTGGAACA
D58-BE4   TGGGCTCCGGCCCCGGGGGTTAACCCGGAGGATATTGACTTGGAGGAGAGCCCTGGAACA
          ********* ******************** *****************************

D58-BG7   GTAACTTACATGAAAAATCCAATACAAGCTATTCCAACTCCAAGATTGCCTGCACACTTG
D58-AB1   GTAACTTACATGAAAAATCCAATACAAGCTATTCCTACTCCAAGATTGCCTGCACACTTG
D58-BE4   GTAACTTACATGA

D58-BG7   TATGGACGTGTGCCAGTGGATATGTAA
D58-AB1   TATGGACGTGTGCCAGTGGATATGTAA

PERCENT IDENTITY OF GROUP 1

          D58-BG7  D58-BE4  D58-AB1
D58-BG7   ***      96.2     98.1     SEQ ID No 1
D58-BE4            ***      94.0     SEQ ID No 7
D58-AB1                     ***      SEQ ID No 3



ALIGNMENT OF GROUP 2

D56-AH7   GAAGGATTGGCTGTTCGAATGGTTGCCTTGTCATTGGGATGTATTATTCAATGTTTTGAT SEQ ID No 9
D13a-5    GAAGGATTGGCTATTCGAATGGTTGCATTGTCATTGGGATGTATTATTCAATGCTTTGAT SEQ ID No 11
          ************ ************* ************************** ******

D56-AH7   TGGCAACGAATCGGCGAAGAATTGGTTGATATGACTGAAGGAACTGGACTTACTTTGCCT
D13a-5    TGGCAACGACTTGGGGAAGGATTGGTTGATAAGACTGAAGGAACTGGACTTACTTTGCCT
          ********* * ** **** *********** ****************************

D56-AH7   AAAGCTCAACCTTTGGTGGCCAAGTGTAGCCCACGACCTAAAATGGCTAATCTTCTCTCT
D13a-5    AAAGCTCAACCTTTAGTGGCCAAGTGTAGCCCACGACCTATAATGGCTAATCTTCTTTCT
          ************** ************************* *************** ***

D56-AH7   CAGATTTGA
D13a-5    CAGATTTGA
          *********

PERCENT IDENTITY OF GROUP 2

          D56-AH7  D13a-5
D56-AH7   ***      93.7     SEQ ID No 9
D13a-5             ***      SEQ ID No 11






ALIGNMENT OF GROUP 3 

D56-AG10 ATAGGTTTTGCGACTTTAGTGACACATCTGACTTTTGGTCGCTTGCTTCAAGGTTTTGAT SEQ ID No 13 

D35-33 ATAGGCTTTGCGACTTTAGTGACACATCTGACTTTTGGTCGCTTGCTTCAAGGTTTTGAT SEQ ID No 15 

D34-62 ATAAATTTTGCGACTTTAGTGACACATCTGACTTTTGGTCGCTTGCTTCAAGGTTTTGAT SEQ ID No 17 

*** ****************************************************** 



D56-AG10 TTTAGTAAGCCATCAAACACGCCAATTGACATGACAGAAGGCGTAGGCGTTACTTTGCCT 

D35-33 TTTAGTAAGCCATCAAACACGCCAATTGACATGACAGAAGGCGTAGGCGTTACTTTGCCT 

D34-62 TTTAGTACGCCATCAAACACGCCAATAGACATGACAGAAGGCGTAGGCGTTACTTTGCCT 
******* ****************** ********************************* 

D56-AG10 AAGGTTAATCAAGTTGAAGTTCTAATTACCCCTCGTTTACCTTCTAAGCTTTATTTATTTTGA 

D35-33 AAGGTTAATCAAGTTGAAGTTCTAATTACCCCTCGTTTACCTTCTAAGCTTTATTTAT-- 

D34-62 AAGGTAAATCAAGTGGAAGTTCTAATTAGCCCTCGTTTACCTTCTAAGCTTTATGTATTCTGA 
***** ******** ************* ************************* *** 



PERCENT IDENTITY OF GROUP 3 

D56-AG10 D35-33 D34-62 

D56-AG10 *** 98.9 95.1 SEQ ID No 13 

D35-33 *** 94.4 SEQ ID No 15 

D34-62 *** SEQ ID No 17 
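The grouping criterion stated in the text (each member shares at least 70% identity with another member of its group) can be checked against an upper-triangular table such as the one above. The following is a minimal sketch under that reading; the function name `best_identity_per_member` and the dictionary layout are hypothetical, and only the three Group 3 values are taken from the table.

```python
def best_identity_per_member(pairs):
    """Map each clone to its highest percent identity with any other clone.

    `pairs` maps a (clone_a, clone_b) tuple to the percent identity read
    from the upper triangle of a group table.
    """
    best = {}
    for (a, b), pct in pairs.items():
        best[a] = max(best.get(a, 0.0), pct)
        best[b] = max(best.get(b, 0.0), pct)
    return best


# Values copied from the Group 3 table.
group3 = {
    ("D56-AG10", "D35-33"): 98.9,
    ("D56-AG10", "D34-62"): 95.1,
    ("D35-33", "D34-62"): 94.4,
}

best = best_identity_per_member(group3)
lowest_pair = min(group3.values())
meets_threshold = all(pct >= 70.0 for pct in best.values())
```

Here `lowest_pair` is the weakest pairwise value within the group, while `meets_threshold` tests the weaker condition that every clone has at least one partner at or above 70%.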

ALIGNMENT OF GROUP 4 

D56-AA7 ATTATACTTGCATTGCCAATTCTTGGCATCACTTTGGGACGTTTGGTTCAGAACTTTGAG 

D56-AE1 ATTATACTTGCATTGCCAATTCTTGGCATTACTTTGGGACGTTTGGTTCAGAACTTTGAG 

D185-BD3 ATTATCCTTGCACTGCCAATTCTTGGCATTACCTTGGGACGCTTGGTGCAGAACTTTGAG 
***** ****** **************** ** ******** ***** ************ 

D56-AA7 CTGTTGCCTCCTCCAGGCCAGTCGAAGCTCGACACCACAGAGAAAGGTGGACAGTTCAGT 

D56-AE1 CTGTTGCCTCCTCCAGGCCAGTCGAAGCTCGACACCACAGAGAAAGGTGGACAGTTCAGT 

D185-BD3 TTGTTGCCTCCTCCAGGACAGTCAAAGCTTGACACAACAGAGAAAGGCGGGCAATTCAGT 

**************** ***** ***** ***** *********** ** ** ****** 

D56-AA7 CTCCACATTTTGAAGCATTCCACCATTGTGTTGAAACCAAGGTCTTTCTGA 

D56-AE1 CTCCATATTTTGAAGCATTCCACCATTGTGTTGAAACCAAGGTCTTGCTGA 

D185-BD3 CTGCACATTTTGAAGCATTCCACCATTGTGATGAAACCAAGATCTTTTTAA 
** ** ************************ ********** **** * * 

PERCENT IDENTITY OF GROUP 4 

D56-AA7 D56-AE1 D185-BD3 

D56-AA7 *** 98.2 87.7 SEQ ID No 19 

D56-AE1 *** 87.1 SEQ ID No 21 

D185-BD3 *** SEQ ID No 143 






ALIGNMENT OF GROUP 5 



D56A-AB6 ATTGCACTTGGGGTTGCATCCATGGAACTTGCTTTGTCAAATCTTCTTTATGCATTTGAT SEQ ID No 27 

I I 

D35-BB7 ATTGCACTTGGGGTTGCATCAATGGAACTTGCATTGTCAAATCTTCTTTATGCATTTGAT SEQ ID No 23 

II 

D177-BA7 ATTGCACTTGGGGTTGCATCCATGGAACTTGCTTTGTCAAATCTTCTTTATGCATTTGAT SEQ ID No 25 

D144-AE2 ATTGCACTTGGGGTTGCATCCATGGAACTTGCTTTGTCAAATCTTCTTTATGCATTTGAT SEQ ID No 29 

******************** *********** ******* ***************** 

D56A-AB6 TGGGAGTTGCCTTATGGAGTGAAAAAAGAAGACATCGACACAAACGTTAGGCCTGGAATT 

D35-BB7 TGGGAGTTACCTTTTGGAATGAAAAAAGAAGACATTGACACAAACGCCAGGCCTGGAATT 

D177-BA7 TGGGAGTTACCTTACGGAGTGAAAAAAGAAAACATTGACACAAATGTCAGGCCTGGAATT 

D144-AE2 TGGGAGTTGCCTTATGGAGTGAAAAAAGAAGACATCGACACAAACGTTAGGCCTGGAATT 

******** **** *** *********** **** ******** * ************ 

D56A-AB6 GCCATGCACAAGAAAAACGAACTTTGCCTTGTCCCAAAAAA-TTATTTATAA 

D35-BB7 ACCATGCATAAGAAAAACGAACTTTATCTTATCCCTAAAAA-TTATCTATAG 

D177-BA7 ACCATGCATAAGAAAAACGAACTTTGCCTTATCCCTAGAAA-TTATCTATAG 

D144-AE2 GCCATGCACAAGAAAAACGAACTTTGCCTTGTCCCAAAAAAATTATTTATAAATTAT 

******* **************** *** **** * *** **** **** 



D144-AE2 ATTGGGACGTGGATCTCATGCTAG 

PERCENT IDENTITY OF GROUP 5 

D56A-AB6 D35-BB7 D144-AE2 D177-BA7 

D56A-AB6 *** 90.6 97.1 91.8 SEQ ID No 27 

D35-BB7 *** 87.7 93.0 SEQ ID No 23 

D144-AE2 *** 88.9 SEQ ID No 29 

D177-BA7 *** SEQ ID No 25 



ALIGNMENT OF GROUP 6 



D56-AG11 ATTTCGTTTGGTTTAGCTAATGCTTATTTGCCATTGGCTCAATTACTTTATCACTTTGAT 

D179-AA1 ATTTCGTTTGGCTTAGCTAATGCTTATTTGCCATTGGCTCAATTACTATATCACTTCGAT 
*********** *********************************** ******** *** 

D56-AG11 TGGGAACTCCCCACTGGAATCAAACCAAGCGACTTGGACTTGACTGAGTTGGTTGGAGTA 

D179-AA1 TGGAAACTCCCTGCTGGAATCGAACCAAGCGACTTGGACTTGACTGAGTTGGTTGGAGTA 
*** ******* ******** ************************************** 

D56-AG11 ACTGCCGCTAGAAAAAGTGACCTTTACTTGGTTGCGACTCCTTATCAACCTCCTCAAAACTGA 

D179-AA1 ACTGCCGCTAGAAAAAGTGACCTTTACTTGGTTGCGACTCCTTATCAACCTCCTCAAAAGTGA 
*********** ************************************************ *** 




PERCENT IDENTITY OF GROUP 6 

D56-AG11 D179-AA1 

D56-AG11 *** 95.6 SEQ ID No 31 

D179-AA1 *** SEQ ID No 33 



ALIGNMENT OF GROUP 7 

D56-AC7 ATGCTATTTGGTTTAGCTAATG2TGGACAACCTTTAGCTCAGTTACTTTATCACTTCGRT SEQ ID No 35 

D144-AD1 ATGCTATTTGGTTTAGCTAATGTITGGACAACCTTTAGCTCAGTTACTTTATCACTTCGAT SEQ ID No 37 

** **************************************************** ****** 

D56-AC7 TGGAAACTCCCTAATGGACAAAGTCATGAGAATTTCGACATGACTGAGTCACCTGGAATT 

1 lit 

D144-AD1 TGGAAACTCCCTAATGGACAAACTCACCAAAATTTCGACATGACTGAGTCACCTGGAATT 
********************** *** * ****************************** 

D56-AC7 TCTGCTACAAGAAAGGATGATCTTGTTTTGATTGCCACTCCTTATGATTCTTATTAA 

D144-AD1 TCTGCTACAAGAAAGGATGATCTTATTTTGATTGCCACTCCTGCTCATTCTTGA 
************************ ***************** * ****** 

PERCENT IDENTITY OF GROUP 7 

D144-AD1 D56-AC7 

D144-AD1 *** 94.3 SEQ ID No 37 

D56-AC7 *** SEQ ID No 35 



ALIGNMENT OF GROUP 9 

D181-AB5 ATGTCGTTTGGTTTAGTTAACACTGGGCATCCTTTAGCTCAGTTGCTCTATTTCTTTGAC SEQ ID No 41 

I 1 i 

D73-AC9 ATGTCGTTTGGTTTAGTTAACACAGGGCATCCTTTAGCCCAGTTGCTCTATTGCTTTGAC SEQ ID No 43 

*********************** ************** ************* ******* 

D181-AB5 TGGAAATTCCCTCATAAGGTTAATGCAGCTGATTTTCACACTACTGAAACAAGTAGAGTT 

D73-AC9 TGGAAACTCCCTGACAAGGTTAATGCAAATGATTTTCGCACTACTGAAACAAGTAGAGTT 

****** ***** * ************ ******** ********************** 

D181-AB5 TTTGCAGCAAGCAAAGATGACCTCTACTTGATTCCAACAAATCACATGGAGCAAGAGTAG 

D73-AC9 TTTGCAGCAAGCAAAGATGACCTCTACTTGATTCCCACAAATCACAGGGAGCAAGAATAG 
*********************************** ********** ********* *** 

PERCENT IDENTITY OF GROUP 9 

D181-AB5 D73-AC9 

D181-AB5 *** 92.8 SEQ ID No 41 

D73-AC9 *** SEQ ID No 43 


ALIGNMENT OF GROUP 11 

D58-AB9 ATGACTTATGCATTGCAAGTGGAACACCTAACAATGGCACATTTGATCCAGGGTTTCAAT SEQ ID No 47 

D56-AG9 ATGACTTATGCATTGCAAGTGGAACACCTAACAATGGCACATTTAATCCAGGGTTTCAAT SEQ ID No 49 

D35-BG11 ATGACTTATGCATTGCAAGTGGAACACTTAACAATGGCACATTTGATCCAAGGTTTCAAT SEQ ID No 53 

D34-25 ATGACTTATGCATTACAAGTGGAACACCTAACAATAGCACATTTGATCCAGGGTTTCAAT SEQ ID No 63 

D35-BA3 ATGACTTATGCATTGCAAGTGGAACACTTAACAATGGCACATTTGATCCAAGGTTTCAAT SEQ ID No 57 

D34-52 ATGACTTATGCATTACAAGTGGAACACCTAACAATAGCACATTTGATCCAGGGTTTCAAT SEQ ID No 61 

D56-AG6 ATGACTTATGCATTGCAAGTGGAACACCTAACAATGGCACATTTAATCCAGGGTTTCAAT SEQ ID No 51 

D35-42 ATGACTTATGCATTGCAAGTGGAACACTTAACAATGGCACATTTGATCCAAGGTTTCAAT SEQ ID No 55 

D34-57 ATGACTTATGCATTACAAGTGGAACACCTAACAATAGCACATTTGATCCAGGGTTTCAAT SEQ ID No 59 

************** ************ ******* ******** ***** ********* 

D58-AB9 TACAGAACTCCAACTGATGAGCCCTTGGATATGAAAGAAGGTGCAGGCATAACTATACGT 

D56-AG9 TACAAAACTCCAAATGACGAGGCCTTGGATATGAAGGAAGGTGCAGGCATAACTATACGT 

D35-BG11 TACAGAACTCCAAATGACGAGCCCTTGGATATGAAGGAAGGTGCAGGCATAACTATACGT 

D34-25 TACAAAACTCCAAATGACGAGCCCCTGGATATGAAGGAAGGTGCAGGATTAACTATACGT 

D35-BA3 TACAGAACTCCAAATGACGAGCCCTTGGATATGAAGGAAGGTGCAGGCATAACTATACGT 

D34-52 TACAAAACTCCAAATGACGAGCCCTTGGATATGAAGGAAGGTGCAGGATTAACTATACGT 

D56-AG6 TACAAAACTCCAAATGACGAGGCCTTGGATATGAAGGAAGGTGCAGGCATAACAATACGT 

D35-42 TACAGAACTCCAAATGACGAGCCCTTGGATATGAAGGAAGGTGCAGGCATAACTATACGT 

D34-57 TACAAAACTCCAAATGACGAGCCCTTGGATATGAAGGAAGGTGCAGGATTAACCATACGT 

**** ******** *** *** ** ********** *********** **** ****** 

D58-AB9 AAGGTAAATCCTGTGAAAGTGATAATTACGCCTCGCTTGGCACCTGAGCTTTATTAA 

D56-AG9 AAGGTAAATCCTGTGGAACTGATAATAGCGCCTCGCCTGGCACCTGAGCTTTATTAA 

D35-BG11 AAGGTAAATCCTGTGGAACTGATAATAGCGCCTCGCCTGGCACCTGAGCTTTATTAA 

D34-25 AAAGTAAATCCTGTAGAAGTGACAATTACGGCTCGCCTGGCACCTGAGCTTTATTAA 

D35-BA3 AAGGTAAATCCTGCGGAACTGATAATAGCGCCTCGCCTGGCACCTGAGCTTTATTAA 

D34-52 AAAGTAAATCCTGTAGAAGTGACAATTACGGCTCGCCTGGCACCTGAGCTTTATTAA 

D56-AG6 AAGGTAAATCCAGTGGAATTGATAATAACGCCTCGCTTGGCACCTGAGCTTTACTAA 

D35-42 AAGGTAAATCCTGTGGAACTGATAATAGCGCCCC TGGCACCTGAGCTTTATTAA 

D34-57 AAAGTAAATCCTGTAGAAGTGACAACTACGGCTCGCCTGGCACCTGAGCTTTATTAA 

** ******** * ** *** ** ** * * **************** *** 




PERCENT IDENTITY OF GROUP 11 

D58-AB9 D56-AG9 D56-AG6 D35-BG11 D35-42 D35-BA3 D34-57 D34-52 D34-25 

D58-AB9 *** 93.8 93.2 94.3 90.8 93.2 90.9 92.0 91.5 SEQ ID No 47 

D56-AG9 *** 96.6 97.2 94.2 96.6 91.5 92.6 92.0 SEQ ID No 49 

D56-AG6 *** 93.8 90.2 92.6 90.3 90.9 90.3 SEQ ID No 51 

D35-BG11 *** 97.1 SSA 3QmS 320 91.5 SEQ ID No 53 

D35-42 *** 96.5 87.3 88.4 87.9 SEQ ID No 55 

D35-BA3 *** 90.3 91.5 90.9 SEQ ID No 57 

D34-57 *** 98.9 98.3 SEQ ID No 59 

D34-52 *** 99.4 SEQ ID No 61 

D34-25 *** SEQ ID No 63 



ALIGNMENT OF GROUP 14 

D177-BD7 ATTAATTTTTCAATACCACTTGTTGAGCTTGCACTTGCTAATCTATTGTTTCATTATAAT SEQ ID No 83 

D177-BD5 ATTAATTTTTCAATACCACTTGTTGAGCTTGCACTTGCTAATCTATTGTTTCATTATAAT SEQ ID No 69 
****************** ****************************************** 



D177-BD7 TGGTCACTTCCTGAGGGGATGCTACCTAAGGATGTTGATATGGAAGAAGCTTTGGGGATT 

D177-BD5 TGGTCACTTCCTGAAGGGATGCTAGCTAAGGATGTTGATATGGAAGAAGCTTTGGGGATT 
************** ********* i*********,***^*****^^^*^^^^.^ 

D177-BD7 ACCATGCACAAGAAATCTCCCCTTTGCTTAGTAGCTTCTCATTATAACTTGTTGTGA 

D177-BD5 ACCATGCACAAGAAATCTCCCCTTTGCTTAGTAGCTTCTCATTATA-CTTGTTGA-- 
i***************************************^**^* 



PERCENT IDENTITY OF GROUP 14 



D177-BD7 D177-BD5 

D177-BD7 *** 96.0 SEQ ID No 83 

D177-BD5 *** SEQ ID No 69 



ALIGNMENT OF GROUP 15 

D56A-AG10 ATGCAACTTGGGCTTTATGCATTGGAAATGGCTGTGGCCCATCTTCTTCATTGTTTTACT SEQ ID No 71 

D58-AD12 ATGCAACTTGGGCTTTATGCATTGGAAATGGCTGTGGCCCATCTTCTTCATTGTTTTACT SEQ ID No 75 

D58-BC5 ATGCAACTTGGGCTTTATGCATTAGAAATGGCAGTGGCCCATCTTCTTCTTTGCTTTACT SEQ ID No 73 
*********************** ******** **************** *** ****** 

D56A-AG10 TGGGAATTGCCAGATGGTATGAAACCAAGTGAGCTTAAAATGGATGATATTTTTGGACTC 

D58-AD12 TGGGAATTGCCAGATGGTATGAAACCAAGTGAGCTTAAAATGGATGATATTTTTGGACTC 

D58-BC5 TGGGAATTGCCAGATGGTATGAAACCAAGTGAGCTTAAAATGGATGATATTTTTGGACTC 

*************** ****************** j,**.***.*** 4 .^ + + ^ Jp + Jlr+J(r AJ(JrAAilr + 4 

D56A-AG10 ACTGCTCCAAAAGCTAATCGACTCGTGGCTGTGCCTACTCCACGTTTGTTGTGTCCCCTT 

D58-AD12 ACTGCTCCAAGAGCTAATCGACTCGTGGCTGTGCCTACTCCACGTTTGTTGTGTCCCCTT 

D58-BC5 ACTGCTCCAAGAGCTAATCGACTCGTGGCTGTGCCTAGTCCACGTTTGTTGTGCCCACTT 
********** ************************** *************** ** *** 

D56A-AG10 TATTAA 

D58-AD12 TATTAA 

D58-BC5 TATTAA 
****** 

PERCENT IDENTITY OF GROUP 15 



D56A-AG10 D58-AD12 D58-BC5 

D56A-AG10 *** 99.5 95.7 SEQ ID No 71 

D58-AD12 *** 96.2 SEQ ID No 75 

D58-BC5 *** SEQ ID No 73 

ALIGNMENT OF GROUP 16 

D56-AD6 ATGCTTTGGAGTGCGAGTATAGTGCGCGTCAGCTACCTAACTTGTATTTATAGATTCCAA SEQ ID No 67 

D56-AC11 ATGCTTTGGAGTGCGAGTATAGTGCGCGTCAGCTACCTAACTTGTATTTATAGATTCCAA SEQ ID No 77 

D35-39 ATGCTTTGGAGTGCGAGTATAGTGCGCGTCAGCTACCTAACTTGTATTTATAGATTCCAA SEQ ID No 79 

D58-BH4 ATGCTTTGGAGTGCGAGTATAGTGCGCGTCAGCTACCTAACCTGTATTTATAGATTCCAA SEQ ID No 81 

★ A*********.****************************** ****************** 

D56-AD6 GTATATGCTGGGTCTGTGTCCAGAGTAGCATGA 

D56-AC11 GTATATGCTGGGTCTGTGTTCAGAGTAGCATGA 

D35-39 GTATATGCTGGGTCTGTGTTCAGAGTAGCATGA 

D58-BH4 GTATATGCTGGGTCTGTGTTCAGAGTAGCATGA 
******************* ************* 

PERCENT IDENTITY OF GROUP 16 



D56-AC11 D56-AD6 D58-BH4 D35-39 

D56-AC11 *** 98.7 98.7 98.7 SEQ ID No 77 

D56-AD6 *** 98.7 96.7 SEQ ID No 87 

D58-BH4 *** 98.7 SEQ ID No 81 

D35-39 *** SEQ ID No 79 

ALIGNMENT OF GROUP 17 

D73A-AD6 CTGAATTTTGCAATGTTAGAGGCAAAAATGGCACTTGCATTGATTCTACAACACTATGCT SEQ ID No 89 

D70A-BA11 CTGAATTTTGCAATGTTAGAGGCAAAAATGGCACTTGCATTGATTCTACAACACTATGCT SEQ ID No 91 

********************************************** + + 

D73A-AD6 TTTGAGCTCTCTCCATCTTATGCACATGCTCCTCATACAATTATCACTCTGCAACCTCAA 

D70A-BA11 TTTGAGCTCTCTCCATCTTATGCACACGCTCCTCATACAATTATCACTCTGCAACCTCAA 

************************** ***************** **************** 

D73A-AD6 
D70A-BA11 

********************************* 

I 






PERCENT IDENTITY OF GROUP 17 

D73A-AD6 D70A-BA11 

D73A-AD6 *** 99.3 SEQ ID No 89 

D70A-BA11 *** SEQ ID No 91 

ALIGNMENT OF GROUP 18 

D70A-AB5 CAAAACTTCGCGATTTTGGAAGCAAAAATGGCTATAGCTATGATTCTACAACGCTTCTCC SEQ ID No 95 

D70A-AA8 CAAAACTTCGCGATTTTGGAAGCAAAAATGGCTATAGCTATGATTCTACAACGCTTCTCC SEQ ID No 97 

D70A-AB5 TTCGAGCTCTCCCCATCTTATACACACTCTCCATACACTGTGGTCACTTTGAAACCCAAA 

D70A-AA8 TTCGAGCTCTCTCCATCTTATACACACTCTCCATACACTGTGGTCACTTTGAAACCCAAA 
***************************** **************** ************** 

D70A-AB5 TATGGTGCTCCCCTAATAATGCACAGGCTGTAG 

D70A-AA8 TATGGTGCTCCCCTAATAATGCACAGGCTGTAG 
********* ************************ 



PERCENT IDENTITY OF GROUP 18 

D70A-AB5 D70A-AA8 

D70A-AB5 *** 99.6 SEQ ID No 95 

D70A-AA8 *** SEQ ID No 97 



ALIGNMENT OF GROUP 19 

D70A-AB8 CAAAATTTTGCCATGTTAGAAGCAAAGATGGCTCTGTCTATGATCCTGCAACGCTTCTCT SEQ ID No 99 

D70A-BH2 ATAAACTTTGCAATGACAGAAGCGAAGATGGCTATGGCTATGATTCTGCAACGCTTCTCC SEQ ID No 101 

D70A-AA4 ATAAACTTTGCAATGGCAGAAGCGAAGATGGCTATGGCTATGATTCTGCAACGCTTCTCC SEQ ID No 103 
*** ***** *** ****** **** ** ** ******* ** *********** 

D70A-AB8 TTTGAACTGTCTCCGTCTTATGCACATGCCCCTCAGTCCATATTAACCGT-CAGCCACAA 

D70A-BH2 TTTGAGCTATCTCCATCTTACACACATGCTCCACAGTCTGTAATAACTATGCAACCCCAA 

D70A-AA4 TTTGAGCTATCTCCATCTTACACACATGCTCCACAGTCTGTAATAACTATGCAACCCCAA 
***** ** ***** ***** ******* ** ***** ** + *** * ** ## 

D70A-AB8 TATGGTGCTCCACTTATTTTCCACAAGCTATAA 

D70A-BH2 TATGGTGCTCCTCTTATATTGCACAAATTGTAA 

D70A-AA4 TATGGTGCTCCTCTTATATTGCACAAATTGTAA 

*********** ***** ** ** ** * ** 



PERCENT IDENTITY OF GROUP 19 

D70A-AB8 D70A-AA4 D70A-BH2 

D70A-AB8 *** 77.8 77.8 SEQ ID No 99 

D70A-AA4 *** 99.3 SEQ ID No 101 

D70A-BH2 *** SEQ ID No 103 
ALIGNMENT OF GROUP 20 

D70A-BA1 CAAAACTTTGCAATGATGGAAGCAAAAATGGCAGTAGCTATGATACTACAAAAATTTTCC SEQ ID No 105 

D70A-BA9 CAAAACTTTGCAATGATGGAACCAAAAATGGCAGTAGCTATGATACTACATAAATTTTCC SEQ ID No 107 

********* 

D70A-BA1 TTTGAACTATCCCCTTCTTATACACATGCTCCATTTGCAATTGTGACTATTCATCCTCAG 

D70A-BA9 TTTGAACTATCCCCTTCTTATACACATGCTCCATTTGCAATTGTGACTATTCATCCTCAG 
************************************************************ 

D70A-BA1 TATGGTGCTCCTCTGCTTATGCGCAGACTTTAA 

D70A-BA9 TATGGTGCTCCTCTGCTTATGCGCAGACTTTAA 
********************************* 



PERCENT IDENTITY OF GROUP 20 



D70A-BA1 D70A-BA9 

D70A-BA1 *** 99.4 SEQ ID No 105 

D70A-BA9 *** SEQ ID No 107 



ALIGNMENT OF GROUP 22 

D144-AH1 TATAGCTTGGGGCTCAAGGAGATTCAAGCTAGCTTAGCTAATCTTCTACATGGATTTAAC SEQ ID No 113 

D34-65 CATAGCTTGGGGCTCAAGGTGATTCAAGCTAGCTTAGCTAATCTTCTACATGGATTTAAC SEQ ID No 115 

D181-AC5 TATAGCATGGGGCTCAAGGCGATTCAAGCTAGCTTAGCTAATCTTCTACATGGATTTAAC SEQ ID No 111 
 ***** ************ **************************************** 

D144-AH1 TGGTCATTGCCTGATAATATGACTCCTGAGGACCTCAACATGGATGAGATTTTTGGGCTC 

D34-65 TGGTCATTGCCTGATAATATGACTCCTGAGGACCTCAACATGGATGAGATTTTTGGGCTC 

D181-AC5 TGGTCATTGCCTGATAATATGACTCCTGAGGACCTCAACATGGATGAGATTTTTGGGCTC 
************************************************************ 

D144-AH1 TCTACACCTAAAAAATTTCCACTTGCTACTGTGATTGAGCCAAGACTTTCACCAAAACTT 

D34-65 TCTACACCTAAAAAATTTCCACTTGCTACTGTGATTGAGCCAAGACTTTCACCAAAACTT 

D181-AC5 TCTACACCTAAAAAATTTCCACTTGCTACTGTGATTGAGCCAAGACTTTCACCAAAACTT 
************************************************************ 

D144-AH1 TACTCTGTTTGA 

D34-65 TACTCTGTTTGA 

D181-AC5 TACTCTGTTTGA 
************ 



PERCENT IDENTITY OF GROUP 22 



D34-65 D181-AC5 D144-AH1 

D34-65 *** 98.4 99.0 SEQ ID No 115 

D181-AC5 *** 99.0 SEQ ID No 111 

D144-AH1 *** SEQ ID No 113 



ALIGNMENT OF GROUP 25 

D58-AA1 TTGGGCTTGGCAACGGTGCATGTGAATTTGATGTTGGCCCGAATGATTCAAGAATTTGAA SEQ ID No 121 

D185-BC1 TTGGGCTTGGGAACGGTGCATGTGAATTTGATGTTGGCCCGAACGATTCAAGAATTTGAA SEQ ID No 133 

D185-BG2 TTGGGCTTGGCAACGGTGCATGTGAATTTGATGTTGGCCCGAATGATTCAAGAATTTGAA SEQ ID No 135 

*********************** ********* *********** -*--*-*^r-** -4T-* **-*>r*r-«r-* 

D58-AA1 TGGTCCGCTTACCCGGAAAATAGGAAAGTGGATTTTACTGAGAAATTGGAATTTACTGTG 

D185-BC1 TGGTCCGCTTACCCGGAAAATAGGAAAGTGGATTTTACTGAGAAATTGGAATTTACTGTG 

D185-BG2 TGGTCCGCTTACCCGGAAAATAGGAAAGTGGATTT-ACTGAGAAATTGGAATTTACTGTG 
***** ****************** ************ ************************ 

D58-AA1 GTGATGAAAAATCCTTTAAGAGCTAAGGTCAAGCCAAGAATGCAAGTGGTGTAA 

D185-BC1 GTGATGAAAAACCCTTTAAGAGCTAAGGTCAAGCCAAGAATGCAAGTGGTGTAA 

D185-BG2 GTGA 

* ** * 



PERCENT IDENTITY OF GROUP 25 

D58-AA1 D185-BG2 D185-BC1 

D58-AA1 *** 95.9 98.9 SEQ ID No 121 

D185-BG2 *** 95.1 SEQ ID No 135 

D185-BC1 *** SEQ ID No 133 



ALIGNMENT OF GROUP 28 

D177-BF7 ATCACATTTGCTAAGTTTGTGAATGAGCTAGCATTGGCAAGATTAATGTTCCATTTTGAT SEQ ID No 127 

D185-BD2 ATCACATTTGCTAAGTTTGTGAATGAGCTAGCATTGGCAAGATTAATGTTCCATTTTGAT SEQ ID No 139 

D185-BE1 ATCACATTTGCTAAGTTTGTGAATGAGCTAGCATTGGCAAGATTAATGTTCCATTTTGAT SEQ ID No 137 

************* *********************************************** 



D177-BF7 TTCTCGCTACCAAAAGGAGTTAAGCATGAGGATTTGGACGTGGAGGAAGCTGCTGGAATT 

D185-BD2 TTCTCGCTACCAAAAGGAGTTAAGCATGCGGATTTGGACGTGGAGGAAGCTGCTGGAATT 

D185-BE1 TTCTCGCTACCAAAAGGAGTTAAGCATGAGGATTTGGACGTGGAGGAAGCTGCTGGAATT 
**************************** ******************************* 

D177-BF7 
D185-BD2 
D185-BE1 



ACTGTTAGAAGGAAGTTCCCCCTTTTAGCCGTCGCCACTCCATGCTCGTGA 
I 

ACTGTTAGGAGGAAGTTCCCCCTTTTAGCCGTCGCCACTCCATGCTCGTGA 
******** ********************************* ********* 



PERCENT IDENTITY OF GROUP 28 

D177-BF7 D185-BD2 D185-BE1 

D177-BF7 *** 99.4 99.4 SEQ ID No 127 

D185-BD2 *** 98.8 SEQ ID No 139 

D185-BE1 *** SEQ ID No 137 



ALIGNMENT OF GROUP 30 

D70A-AA12 ATGTCATTTGGTTTAGCTAATCTTTACTTACCATTGGCTCAATTACTCTATCACTTTGAC SEQ ID No 131 

D176-BF2 ATATCATTTGGTTTGGCTAATGTTTATTTGCCACTAGCTCAATTGTTATATCATTTTGAT SEQ ID No 85 
+ * **** ** •* + + * *★**★ + ** * ★ ***** 



TGGAAACTCCCAACCGGAATCAAGCCAAGAGACTTGGACTTGACCGAATTATCGGGAATA 






PERCENT IDENTITY OF GROUP 30 

D176-BF2 D70A-AA12 

D176-BF2 *** 77.0 SEQ ID No 85 

D70A-AA12 *** SEQ ID No 131 
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Figure 153: Cloning of Cytochrome P450 cDNA Fragments by PCR 
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